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Superoxide dismutase-1 (SOD1) is a ubiquitous, Cu and Zn binding, 
free radical defense enzyme whose misfolding and aggregation play a 
potential key role in amyotrophic lateral sclerosis, an invariably fatal 
neurodegenerative disease. Over 150 mutations in SOD1 have been 
identified with a familial form of the disease, but it is presently 
not clear what unifying features, if any, these mutants share to 
make them pathogenic. Here, we develop several new computa- 
tional assays for probing the thermo-mechanical properties of both 
ALS-associated and rationally-designed SODlvariants. Allosteric in- 
teraction free energies between residues and metals are calculated, 
and a series of atomic force microscopy experiments are simulated 
with variable tether positions, to quantify mechanical rigidity "fin- 
gerprints" for SOD1 variants. Mechanical fingerprinting studies of 
a series of C-terminally truncated mutants, along with an analysis 
of equilibrium dynamic fluctuations while varying native constraints, 
potential energy change upon mutation, frustratometer analysis, and 
analysis of the coupling between local frustration and metal binding 
interactions for a glycine scan of 90 residues together reveal that the 
apo protein is internally frustrated, that these internal stresses are 
partially relieved by mutation but at the expense of metal-binding 
affinity, and that the frustration of a residue is directly related to 
its role in binding metals. This evidence points to apo SOD1 as 
a strained intermediate with "self-allostery" for high metal-binding 
affinity. Thus, the prerequisites for the function of SOD1 as an 
antioxidant compete with apo state thermo-mechanical stability, in- 
creasing the susceptibility of the protein to misfold in the apo state. 
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Abbreviations: (F.S)ALS, (familial, sporadic) amyotrophic lateral sclerosis; WT, wild- 
type; ZBL, Zn-binding loop; ESL, electrostatic loop; RMSF, root mean squared fluc- 
tuations; SASA, solvent-accessible surface area; PTM, post-translational modifica- 
tion; Cu.Zn(SS), holo, disulfide present; E,E(SH), apo, disulfide-reduced; WHAM, 
weighted histogram analysis method; SI, Supporting Information Appendix 

Allosteric regulation canonically involves the modulation 
of a protein's affinity for a given ligand A through the 
binding of a separate ligand B to a distinct spatial location on 
the protein. The modulatory binding site at the distinct lo- 
cation is referred to as the allosteric site, and the interaction 
between the allosteric site and the putative agonist binding 
site is referred to as an allosteric interaction. Early models 
of allostery were used to explain ligand saturation curves for 
hemoglobin in terms of subunit interactions that would induce 
binding cooperativity [1,2]. Cooperativity may be quantified 
through the non-additivity in the binding energies of ligand 
and allosteric effector [3]. More recently, allostery has been 
thought to be a more generic property present even in single 
domain proteins [4]. In this context, positive cooperativity 
may be modulated through frustrated intermediates [5] , which 
may enhance the conformational changes observed upon lig- 
and binding for allosteric proteins [6]. In a single domain 
protein, the notion of an allosteric effector can generalized 



to intrinsic protein side chains that can enhance protein func- 
tion or functionally important motion at the expense of native 
stability- a kind of "self-allostery" . Some allosteric activation 
mechanisms involve mediation of conformational switches by 
non-native intra-protein interactions [7] , an effect predicted by 
energy landscape approaches wherein barriers between confor- 
mational states are buffed to lower energies [8] . The potential 
for novel allosteric regulators may vastly broaden candidate 
targets for drug discovery [9]. The internal frustration re- 
quired for cooperative allosteric function may have deleteri- 
ous consequences however, if protein stability is sufficiently 
penalized in intermediate states to enhance the propensity for 
misfolding and subsequent aberrant oligomerization, processes 
known to be involved in neurodegenerative disease [10]. Here 
we show that the ALS-associated protein Cu, Zn superoxide 
dismutase (SOD1) is embroiled in such a conflict between sta- 
bility and function. 

SOD1 is a homo-dimeric antioxidant enzyme of 32kDa, 
wherein each monomer contains 153 amino acids, binds one 
Cu and one Zn ion, and consists largely of an eight stranded 
greek key f3 barrel with two large, functionally important 
loops [11,12,13]. Loop VII or the electrostatic loop (ESL, 
residues 121-142) enhances the enzymatic activity of the pro- 
tein by inducing an electrostatic funnel towards a redox active 
site centered on the Cu ion [14]. Loop IV or the Zn-binding 
loop (ZBL residues 49-83), contains histidines H63, H71, and 
H80 as well as D83, which coordinate the Zn ion and, along 
with a disulfide bond between C57 and C146, enforce concomi- 
tant tertiary structure in the protein. The Cu on the other 
hand is coordinated by H46, H48, and H120, and the bridg- 
ing histidine H63 between the Cu and Zn, which are located 
primarily in the Ig-like core of the protein. 

SALS inclusions are immunoreactive to misfolding-specific 
SOD1 antibodies [15,16]. Such misfolded aggregates may be 
initiated from locally (rather than globally) unfolded states 
that become accessible, for example, via thermal fluctuations 
or rare events [17], e.g. near-native aggregates or aggregation 
precursors were found for an obligate monomeric SOD1 vari- 
ant [18], and for FALS mutants S134N [19,20] and E,E(SS) 
H46R [19]. These above studies have motivated the present 
computational study, which focuses on the native and near- 
native thermo-mechanical properties of mutant, WT and post- 
translationally modified SOD1. 



Results 

Cumulative distributions of work values can discriminate 
SOD1 mutants and premature variants.. Mechanical probes 
were employed by simulating tethers on the residue closest 
to the center of mass of the SOD1 variant, and on various 
residues on the protein surface. The experimental analogue 
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Fig. 1. (Panel A) Force (inset a), Work (main panel), and effective modulus (inset b) as a function of extension. Tethers are placed at the C a atom closest to the center 
of mass of the SOD1 monomer (H46), and the C a atom of either residues G10 (red) or 117 (green), in separate pulling assays. Ribbon representations of the protein are also 
shown; the tethering residue is shown in licorice rendering (in red) and the center C a as a red sphere. The initial equilibrated (at A , green ribbon) and final (at 5 A , blue 
ribbon) structures are aligned to each other by minimizing RMSD. (Panel B) (Inset a) Work profiles of Cu,Zn(SS) WT (black), E.E(SH) WT (blue), and E,E G127X SOD1 
(red) vs. sequence index. Secondary structure schematic is shown underneath. (Main panel) Cumulative distributions of the work values in inset (a). E,E G127X is more stable 
than full-length E,E (SH) SOD1 (p=9e-7). (Inset b) Fraction of the 48 incidences that each variant had either the weakest, strongest, or middle work value, e.g. E,E(SH) 
SOD1 is weakest 80% of the time and is never the strongest variant. (Panel C) Cumulative work distributions for Cu.Zn (SS) WT (black), Cu,Zn G127X (green), E,E G127X 
(red), and Cu,Zn (SH) WT (cyan). Cu,Zn G127X is destabilized with respect to full-length Cu.Zn (SS) WT (p=6.2e-8). (Inset a) Same analysis as panel B, inset b, for the 
variants Cu.Zn(SS) WT, Cu.Zn G127X, and E,E G127X. (Panel D) Cumulative distributions for serine mutant SOD1 variants demonstrate that C-terminal truncation stabilizes 
the apo form but destabilizes the holo form (see text). 



to such an in silico approach would require multiple AFM 
or optical trap assays involving numerous residue pairs about 
the protein surface as tethering points. This is difficult and 
time-consuming to achieve in practice, which thus provides an 
opportunity for the present simulation approaches. All pro- 
teins in this study were taken in the monomeric form- many 
of them such as E,E(SH) SOD1, G127X, and G85R are ei- 
ther naturally monomeric or have significantly reduced dimer 
stability. The ALS-associated truncation mutant G127X con- 
tains a frameshift insertion which results in 6 non-native 
amino acids following Glyl27, after which a truncation se- 
quence terminates the protein 20 residues short of the putative 
C-terminus [21,22]. Here PTMs refer to processes involved in 
the in vivo maturation of SOD1, including disulfide bond for- 
mation, and metallation by Cu and Zn. 

The simulated force-extension profile may be used to ob- 
tain the work required to pull a given residue to a given dis- 
tance (Fig. 1A). In this assay, we found that large effective 



stiffness moduli were observed for small perturbing distances 
(Fig. 1A inset b). These were attributable primarily to side 
chain-side chain "docking" interactions in the native structure 
(SI Appendix, Fig. S2). The work to pull a given residue to 
a distance sufficient to constitute an anomalously large fluc- 
tuation, e.g. 5A, can be calculated as a function of sequence 
index for a given SOD1 variant, resulting in a characteristic 
mechanical profile or "mechanical fingerprint" for that protein 
(Fig. IB, inset a). The work values at 5A do not correlate 
with RMSF values of the corresponding residues [23] . A rep- 
resentative subset of 48 residues was obtained as described in 
the SI. We found that such a profile was independent of how 
the protein was initially constructed before equilibration, i.e. 
from what PDB, but was clearly different between WT SOD1 
and mutants such as G127X, as well as other PTM variants 
such as E,E(SH) SOD1 (SI Fig. S4). Table SI in SI gives the 
correlation coefficients between variants. A given PTM glob- 
ally modulates the mechanical profile, inducing both local and 
non-local changes in stability that can be both destabilizing 
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Fig. 2. (Panel A) Cumulative distributions of work values for C-terminal-truncated S0D1 variants of variable sequence length show a non-monotonic trend in mechanical 
stability. All variants are metal-depleted and have no disulfide bond. Sequences are given in the legend (see text). (Inset a) Comparing the cumulative distributions in the 
main panel, that of the mutant G127X is most commonly the strongest, full-length SOD1 is most commonly the weakest, and 1-140 is most often in the middle. (Inset b) 
Change in work value W MUT ~ W\yt averaged over residues, as a function truncation length, for the SOD1 variants in the main panel. (Inset c) Ribbon schematics of 
the various truncation mutants, colored blue to red from N- to C-terminus, labeled by C-terminal residue. (Panel B) Simulated native-basin dynamical fluctuations (RMSF) in 
explicit SPC solvent, for Cu.Zn(SS) (black) and E,E(SS) SOD1 monomer (red), along with the experimentally measured ratio of spectral density functions J (uu h) / J (oj n) 
of obligate monomeric E,E(SS) F50E/G51E/E133Q SOD1 (blue bars) [31]. Correlation coefficient is r = 0.78. (Panel C) Simulated RMSF for SOD1 variants E,E G127X 
(black), E,E(SS) (red), E,E(SH) (blue), and E,E(SH) with the ESL constrained to be natively structured (magenta). The presence of native stress is indicated by the increased 
disorder of the ZBL upon structuring the ESL (see text). (Panel D) Snapshots of typical structures of E,E(SH) and G127X SOD1 from equilibrium simulations, color coded by 
the mean RMSF for each residue; RMSF increases from blue to red according to the scale bars shown. 



in some regions and stabilizing in others. A detailed study 
of the mechanical consequences of mutations or PTMs, e.g. by 
studying the long-range communication through interaction 
networks in the mutant vs WT protein (see e.g. [24,25,26,27]) 
is an interesting topic of future research. 

We also found that the mechanical work profile was nearly 
independent of whether an explicit or implicit solvent model 
was used in the simulations, even though native basin fluctua- 
tions (RMSF) showed significant scatter in comparing implicit 
and explicit solvent models (SI Fig. S5). Equilibrium fluctua- 
tions are much more sensitive to solvent models than the large 
scale perturbations we consider here. On the other hand, a 
structure-based Go model does only a modest job of capturing 
the mechanical profile, and is poor in particular for residues 
where electrostatic interactions play a significant role in sta- 
bility (SI Fig. S6). Interestingly, the default energy scale of 1 
k J / mol for each contact or dihedral interaction used in current 



Go models [28] captures the overall energy scale of the work 
profile quite well for SOD1, though it must likely be modified 
for other proteins such as thermophilic proteins. 

In support of the thermodynamic relevance of a SOD1 
variant's mechanical fingerprint, we found that the non- 
equilibrium work values obtained by pulling a residue to 
5A strongly correlate with the equilibrium free energy change 
for the same process, as calculated using the WHAM method 
(r=0.96, SI Fig. S7). Thus, the work profiles obtained are an 
accurate measure of the corresponding local thermodynamic 
stability profile, up to a scaling factor. 

Comparing the mechanical fingerprints of E,E(SH) SOD1 
and that of G127X (Fig. IB, inset a), we see that many dis- 
crepancies between SOD1 variants and WT are difficult to 
disentangle solely from the work profile, or even from his- 
tograms of the work values (SI Fig. S3). However, the me- 
chanical discrepancies emerged quite naturally from the cumu- 
lative distribution of work values. Mechanical scans were thus 
used to construct the cumulative distributions, which then al- 
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lowed us to distinguish stabilizing energetics in various forms 
of PTM SOD1, e.g. between E,E(SH), G127X, and Cu,Zn(SS) 
SOD1 in Figure IB. Not all residues needed to be sampled to 
obtain the cumulative distribution- we found convergence to 
within « 1 kJ/mol after a sample size of about 40 residues (SI 
Fig. S8). This does not mean the mechanical profile obtained 
from 48 residues is representative of the stability in the corre- 
sponding regions of the protein: the mechanical work values 
were essentially uncorrelated for amino acids about 3 residues 
apart. 

Comparison of the cumulative work distributions shows 
that the ALS-associated mutant Cu,Zn(SS) A4V is slightly 
more mechanically malleable than Cu,Zn(SS) WT, particu- 
larly in the weaker regions of the protein (SI Fig. S13). How- 
ever the mechanical weakening due to mutation is substan- 
tially larger in the E,E(SS) state, and largest in the E,E(SH) 
state (green curves in SI Fig. S13). Experimental measure- 
ments of melting temperature also have shown increased sus- 
ceptibility to mutation in the E,E(SH) state for A4V [29]. 
We also note from SI Figure S13 that disulfide reduction in 
the WT apo state, E,E(SS)— >>E,E(SH), actually increases the 
rigidity of the native basin [23], even though the net ther- 
modynamic stability is decreased, primarily due to increased 
unfolded entropy [30]. 



Lack of post-translational modifications mechanically desta- 
bilizes SOD1, however the truncation mutant G127X stabi- 
lizes the apo, disulfide-reduced protein.. The cumulative dis- 
tribution of E,E(SH) lies to the left of that for Cu,Zn(SS) 
SOD1 in Figure IB for all rank-ordered work values, illustrat- 
ing that that variant is more malleable and thus more suscepti- 
ble to perturbing forces that might induce the conformational 
changes accompanying misfolding. Likewise, Cu,Zn(SH) is 
destabilized with respect to Cu,Zn(SS) in Figure 1C, and the 
other PTM variants in SI Figure S14 show that lack of PTMs 
reduces the mechanical rigidity. 

On the other hand, the E,E G127X truncation mutant, 
which lacks both metals, a disulfide bond, and part of the 
sequence, shows increased mechanical stability over E,E(SH) 
WT (p = 9e-7, Fig. IB). Comparing the profiles, E,E(SH) 
is most commonly the weakest, Cu,Zn(SS) is most commonly 
the strongest, and G127X is most commonly in the middle 
(Fig. IB Inset b). Apparently the C-terminal region of 
the protein mechanically stresses the remainder of the pro- 
tein when PTMs are absent, reducing its mechanical stabil- 
ity. However, a met ast able holo variant of G127X contain- 
ing metals in their putative positions is less mechanically sta- 
ble than holo WT (p = 6.2e-8, Fig. 1C). In the full-length 
holo protein, native stabilizing interactions are more apt to 
be minimally frustrated, and thus C-terminal truncation in- 
duces softening of the native structure. The metallation of 
G127X only marginally stabilizes it (p — 6.7e-3, Fig. 1C in- 
set a) , while metallation of WT protein results in substantial 
mechanical stabilization (SI Fig. S14). G127X lacks C146, so 
Cu,Zn(SH) WT may be a more appropriate comparison than 
Cu,Zn(SS) WT. Cu,Zn(SH) SOD1 is less stable than either 
Cu,Zn G127X or E,E G127X (Fig. 1C), however Cu,Zn(SH) 
contains a protononated Cysteine CI 46, which is destabiliz- 
ing, and not present in G127X. 

To deconvolute the effects of protonated Cysteines, we ex- 
amined a set of 4 serine mutant proteins: Cu,Zn C57S/C146S 
1-153 (full length), E,E C57S/C146S 1-153, Cu,Zn C57S 1- 
127 (truncated at residue 127), and E,E C57S 1-127. For this 
system of proteins, Cu,Zn C57S/C146S 1-153 is clearly the 
most stable over most of the range of work values (Fig. ID), 
and Cu,Zn C57S 1-127 is destabilized with respect to it. E,E 



C57S 1-127 is further destabilized with respect to its holo 
form, but is more stable than the full length apo form E,E 
C57S/C146S 1-153. This demonstrates that C-terminal trun- 
cation stabilizes the apo form but destabilizes the holo form. 
This conclusion is robust to changes in pulling distance (SI 
Fig. SI 5). 

Short length C-terminal truncation mechanically stabilizes 
the apo protein, while sufficiently long C-terminal truncation 
destabilizes it.. The mechanical properties of two additional 
truncated constructs, E,E 1-140 (WT sequence), and E,E 1- 
110, were assayed to investigate the crossover from increased 
to eventually reduced mechanical stability as the length of 
truncation is increased (Fig. 2A inset b). All truncation 
variants are missing the putative disulfide bond. Figure 2 A 
shows cumulative distributions for the above variants along 
with E,E(SH) SOD and E,E G127X. The truncation E,E 1- 
140 mechanically stabilizes E,E(SH) SOD1 (p = le-3), and 
G127X is further stabilized with respect to 1-140 (p = 9e-4). 
Comparison of the triplets of rank-ordered work values cor- 
responding to E,E(SH), E,E 1-140, and E,E G127X supports 
this conclusion (Fig. 2A inset a). We have omitted the 3 
least stable work values for all variants in calculating statis- 
tical significance- these are outliers for E,E(SH) that would 
dominate the result towards the conclusion that we have ar- 
rived at without their inclusion. Variant E,E 1-110 has sig- 
nificantly compromised mechanical rigidity and may not be 
thermodynamically stable- the mechanical assay only probes 
malleability of the native basin. 

Simulated fluctuations correlate with experimental spectral 
density functions for apo SOD1.. We performed 20ns equilib- 
rium simulations in explicit solvent (SPC water model) for 
Cu,Zn(SS) and E,E(SS) SOD1. Native-basin RMSF values 
show significant scatter between explicit and implicit solvent 
models compared to mechanical work values (SI Fig. S5), 
so for analysis of equilibrium fluctuations, explicit solvent is 
used. Figure 2B plots the resulting equilibrium RMSF for 
Cu,Zn(SS) and E,E(SS) SOD1; SI Figure S9 also gives the 
solvent-accessible surface area of backbone amide Nitrogens 
(SASAjv) for the same equilibrium trajectory. The main effect 
of metal loss is to induce solvent exposed, disordered, and dy- 
namic Zn-binding and electrostatic loops (loops IV and VII). 
A moderate increase in dynamics is observed for loop VI as 
well. The preferential increase in dynamics of the ZBL and 
ESL upon metal loss is consistent with experimental measure- 
ments of the ratio of spectral density functions J(luh)/ J(oon)- 
a measure of dynamics fast compared to the tumbling rate 
(correlation coefficient r = 0.78). The experimental measure- 
ments of spectral density [31] are obtained for a monomeric 
E,E(SS) SOD1 mutant F50E/G51E/E133Q [32]. 

Dynamic fluctuations in C-terminal truncation mutants re- 
veal native frustration in apo SOD.. Because G127X is more 
mechanically stable than E,E(SH) SOD1, native basin fluctu- 
ations were investigated to see if the extra stability of G127X 
was recapitulated in equilibrium dynamics. Figure 2C shows 
that indeed the RMSF are substantially enhanced in E,E(SH) 
relative to G127X, in particular in the ZBL and ESL (loops IV 
and VII). This is true even though /3-strand 8, N-terminal to 
the ESL, constrains E,E(SH) and is absent in G127X. Though 
more dynamic and mechanically malleable, E,E(SH) is not 
more solvent-exposed (by SASAat) than G127X, indicating a 
collapsed, dynamic globule with non-native interactions (SI 
Fig. S10). As well, E,E(SH) is more dynamic than E,E(SS) 
SOD1 (Fig. 2C), but less solvent exposed than E,E(SS), par- 
ticularly in the ZBL and ESL (SI Fig. S10). Collapse and non- 
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native interactions are not hindering the dynamics of the ZBL 
and ESL; snapshots from simulations for E,E(SH) WT and 
G127X are shown in Figure 2D. Similar condensation phenom- 
ena are also observed in prion protein [33] and NHERF1 [34]. 
In a large scale study of 253 proteins across several fold fami- 
lies [35], RMSF and SASA showed poor correlation (r ~ 0.35 
for backbone carbons). 

It is intriguing that the additional constraint of structur- 
ing f38 in E,E(SH) SOD1 results in enhanced rather than sup- 
pressed disorder in the ZBL. One potential explanation is that 
because there is more residues in the ESL present in E,E(SH) 
than in G127X, the ESL forces the ZBL to be more expanded 
by steric repulsion, and thus more dynamic, i.e. a "polymer 
brush" effect. Another possible explanation is that the order 
induced by structuring f38 in E,E(SH) SOD1 induces frustra- 
tion and consequent strain elsewhere in the protein, result- 
ing in induced disorder in the ZBL. We differentiated these 
two scenarios by applying native constraints between the C a 
atoms in the ESL and the rest of protein, but excluding con- 
tacts between the ESL and the ZBL: harmonic springs were 
applied to all pairs of C& atoms within 4A in the native holo 
structure that involved contacts either within residues 133- 
153, or between residues 133-153 and either residues 1-40 or 
90-153. This procedure further constrains the ESL and f38 
to be natively structured, removing any polymer brush effect, 
but enhancing any native strain. Consistent with a model 
involving native frustration, the ZBL loop IV becomes more 
disordered in E,E(SH) SOD1 upon implementing ESL//38 na- 
tive constraints (Fig. 2C). 



"Frustratometer" results and potential energy changes sup- 
port increased frustration in apo WT SOD1 with respect 
to apo ALS-associated mutants.. Typically frustrated con- 
tacts, according to the "frustratometer" method developed 
by Wolynes and colleagues [5], were found by averaging 50 
snapshots from an equilibrium ensemble for WT SOD1, and 
for each of 22 ALS-associated SOD1 mutants (SI Table S3). 
Both E,E(SS), and Cu,Zn(SS) states were analyzed. Results 
were averaged over the 22 mutants to yield the mean num- 
ber of frustrated contacts at a given residue position, for the 
"average" mutant. Taking the difference of this quantity with 
that for WT SOD1 gives the mean change in frustration upon 
mutation, which increased for holo SOD1 by about 5 total 
contacts, but decreased for apo SOD1 by about 22 total con- 
tacts (Fig. 3A). This result again supports a frustrated apo 
state in SOD1. Direct computation shows on average about 
38 more highly- frustrated contacts in E,E(SS) SOD1 than in 
Cu,Zn(SS) SOD1 (SI Fig. S17). 

For the same set of mutants, we found that the ensemble- 
averaged total potential energy in the native state increased 
upon mutation for the Cu,Zn(SS) protein, but decreased upon 
mutation in the E,E(SS) protein. This is consistent with the 
above frustratometer results. The "time-resolved" change in 
potential energy AU upon in silico mutation for a representa- 
tive mutant (G37R) is shown in Figure 4A. The distribution 
of potential energy changes for the 22 ALS-associated mutants 
listed in SI Table S3 is shown in Figure 4B, which shows po- 
tential energy "cost" upon mutation for Cu,Zn(SS) mutants, 
but a significant shift towards negative stabilizing values for 
E,E(SS) mutants. The same conclusion is obtained from 30 
non-ALS alanine mutants (SI Fig. S18). Figure 4C shows 
that the net effect of mutation on the potential energy is to 
increase it on average in the holo state, but to decrease it on 
average in the apo state, indicating stabilization. Moreover, 
all ALS mutants facilitate metal release [23]; the net effect of 
mutation on the Cu and Zn binding free energies is to decrease 



both of them. Thus mutations relieve stress in the apo state, 
while inducing loss of metal binding function. It thus appears 
that apo SOD1 has evolved to have high affinity for metals at 
the expense of native stability and increased frustration. 
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Fig. 3. (A) Frustrated contacts (in red) and unfrustrated contacts (in green) for 
E,E(SS) WT SOD1, (B) Same contacts as (A) for the average over 22 ALS E,E(SS) 
mutants (see text). (C) The mean number of frustrated contacts within a sphere 
of radius 5A centered on each C a atom, is found as a function of residue index. 
Ensemble averages are taken from 50 snapshots in an equilibrium simulation. This is 
done for both the Cu,Zn(SS) state and the E,E(SS) state, for both the WT sequence, 
and for 22 mutant sequences. The 22 mutant sequences are averaged to obtain the 
ensemble and mutant averaged number of contacts as a function of residue index '. 
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Plotted is the difference between (n HF (z)) MUT and the corre- 



sponding numbers for the WT sequence n^ T (i) . A positive number would indicate 
an increase in frustration upon mutation. Holo state is shown in blue and has an 
average of +5 contacts; apo state is shown in red and has an average of -22 contacts. 
(B) Interaction free energy between a residue's side chain and the Cu ion, plotted as a 
function of the E,E(SS) ensemble-averaged number of highly-frustrated contacts that 
residue has (r=-0.79, p=8e-21). (C) Same as in Panel B but for the Zn ion (r=-0.81, 
p=2e-22). 
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Fig. 4. (A) Change in potential energy AU(t) as a function of in silico time, 
before and after implementing the mutation G37R. (B) Distribution of the asymptotic 
potential energy change AU(oo) for 22 ALS mutants (see text). (C) Mean poten- 
tial energy change averaged over mutants, for both the holo state and the apo state, 
along with the mean difference, WT minus mutants, in both Cu and Zn binding free 
energy. 



Frustrating residues in the apo state are allosteric effectors, 
and positively modulate metal affinity in proportion to their 
frustration. To test the extent to which the residues facilitat- 
ing metal binding are frustrated, we have calculated the inter- 
action free energy of each WT residue with both Cu and Zn, 
by considering thermodynamic cycles [3] involving metallation 
and residue "insertion" from a glycine at the corresponding 
position (e.g. G4A). The interaction free energy Gint between 
each residue side-chain and either Zn or Cu is given (here 
specifically for residue Ala 4 with Zn) by 

G int = AG (G^A, O^Zn) - AG (G^A, O) - AG (G, O^Zn) 
= AG (A, O^Zn) - AG (G, O^Zn) [ 1 ] 

where AG (A, O— >Zn) is the free energy change of Zn insertion 
when alanine is present at position 4, and AG (G, O— >-Zn) is 
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the free energy change of Zn insertion when glycine (no side 
chain) is present at position 4. The other residues, and Cu 
interactions, are handled analogously (see SI Methods). 

Figures 3B,C plot the above interaction free energy of a 
residue with Cu or Zn, vs. the E,E(SS) ensemble-averaged 
number of highly-frustrated contacts that residue has. Data 
are obtained for 90 glycine mutants listed in SI Table S3, for 
residues that had at least one highly-frustrated contact (see 
SI). The values of Gi n t for all 90 mutants listed in SI Table S3 
are negative, indicating cooperative interactions, wherein the 
WT residue facilitates binding of the metal. Each of these 
residues can be thought of as allosteric effectors, positively 
modulating affinity for either metal. For both Cu and Zn, the 
larger the degree of frustration in the apo state, the larger the 
role that residue has in facilitating metal binding. No such 
trend is seen for Cu,Zn(SS) SOD1 (SI Fig. S18). This result 
strongly supports a model of the apo state as an allosteric 
intermediate designed for high metal binding affinity at the 
expense of structural stability. 




Fig. 5. Panel (A): Residues color-coded by interaction energy with the Cu ion 
(depicted as a cyan sphere). The extent of interaction is strongest in magnitude for 
red colored residues and decreases to blue. Panel (B): same as (A) for the Zn ion 
(depicted as a grey sphere). Panel (C): Interaction energy with Cu correlates with 
the distance of the residue from the Cu ion; residues in close proximity more strongly 
interact. Panel (D): Interaction energy with Zn does not correlate with distance of 
the residue to the Zn ion, indicating non-local allosteric effects. 

Finally, we test the distance-dependence of the interaction 
free energy with the metals, for the set of 24 ALS-associated 
mutants in SI Table S3. In the case of Cu, the allosteric reg- 
ulation for metal affinity is significantly correlated with the 
proximity of the residue to the Cu binding site (Fig. 5A,C). 
Interestingly, the allosteric regulation for Zn binding is uncor- 
rected with distance to the Zn binding site and thus nonlo- 
cal. This finding is consistent with experimental results that 
Zn binding is concomitant with large structural change (par- 
tial folding) of the protein [36]. Long-range coupling to the 
Zn-binding region has also been observed in G93A SOD1 [37]. 
The same conclusion is obtained for the 90 frustrated residues 
given in SI Table S3 (SI Fig. S19). 



Discussion 

We have found here a connection between the allosteric design 
of residues in premature superoxide dismutase towards high 
metal-binding affinity, and the consequent frustration in the 
apo state of the protein. A variety of results supported this 
conclusion. Mechanical profiles were obtained from in silico 
AFM assays with variable tether positions; in this context it 
was found that the C-terminal truncation mutant G127X had 
higher mechanical rigidity than E,E(SH) SOD1, implying the 
release of internal stresses upon removal of part of the protein. 
Large truncation lengths eventually destabilized the protein. 

The higher malleability of E,E(SH) SOD1 over G127X is 
recapitulated by larger equilibrium dynamical fluctuations in 
the native basin. Constraining loop VII (the ESL) to be na- 
tively structured only increases fluctuations in loop IV (the 
ZBL), ruling out polymer brush effects and supporting the 
native stress hypothesis. 

Implementing Wolynes's "frustratometer" method [5] 
shows that, perhaps surprisingly, more frustration is present 
on average in the WT apo state than is present for apo mu- 
tants. For the holo state the situation is reversed however, 
which is consistent with the notion that mutants facilitate 
metal release. In fact every ALS-associated mutant we have 
studied lowered the affinity for both Cu and Zn [23] . We have 
found here that while these mutants raise the potential energy 
of the holo state, they tend to lower the potential energy of the 
apo state and thus stabilize it, consistent with frustratome- 
ter results. A general comparative analysis of the decrease in 
potential energy and frustration for apo ALS-associated mu- 
tants, along with the results from their individual mechanical 
scans which generally show weakening with respect to local 
perturbations, is an interesting topic for future work. SI Fig- 
ure S20 compares the relevant quantities for the ALS mutants 
A4V and G127X. 

Residues in apo SOD1 can be thought of as allosteric ef- 
fectors for metal binding. By considering the cooperativity 
in thermodynamic cycles involving mutation to glycine and 
metal release, we quantified the interaction free energies be- 
tween residues in the protein and either Cu or Zn. All inter- 
action energies were negative, indicating positive modulation 
of metal affinity, Moreover we found that function frustrates 
stability: the stronger the interaction energy, the more frus- 
trated the residue. For Cu, the strength of the interaction sig- 
nificantly correlates with proximity to the binding site. For 
Zn however, there is no correlation with proximity, indicat- 
ing a non-local allosteric mechanism involving propagation of 
stress-release throughout the protein, and consistent with the 
large structural changes accompanying Zn binding. 

The above evidence points towards a paradigm wherein 
sequence evolution towards high metal affinity results in a 
trade-off for significant native frustration in the apo state of 
SOD1. A similar conclusion has been reached from studies 
of a SOD1 variant with Zn-coordinating ligands H63, H71, 
H80, and D83 mutated to S, which in the apo form is stabi- 
lized with respect to E,E WT SOD1 [38]. In this context, the 
C-terminal truncation in G127X can be seen as an allosteric 
inhibition mechanism to Zn binding, in that frustration is re- 
lieved in the apo native state, but Zn-binding function is lost. 
A similar scenario is observed for select mutants of subtil- 
isin, a serine protease whose function is regulated by Ca 2+ 
binding. In this protein, the mutation M50F preferentially 
stabilizes apo subtilisin relative to the holo form, while weak- 
ening calcium binding and promoting inactivation in the holo 
form [39]. 

Native frustration in apo SOD1 as a result of allosteric 
cooperativity in metal binding has potential consequences for 
the misfolding of SOD1. The premature protein, or a pro- 
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tein that perhaps due to an external agent has lost its metals, 
would show decreased thermal stability relative to one that 
had not undergone sequence evolution for high metal affinity. 
In this sense, the tight binding of Zn and Cu essential for en- 
zymatic function of the mature protein as an antioxidant puts 
the premature form in additional peril for misfolding. 

Materials and Methods 

A full description of the methods is given in the SI. Missense and truncation mutants 
of S0D1, both ALS-associated and rationally designed, were equilibrated and used for 
mechanical force, dynamic fluctuation, frustratometer, potential energy, and WHAM 
metal affinity assays. Rationally-designed truncation and missense mutants studied 
here include C57S/C146S, C57S 1-127, and WT sequences 1-110 and 1-140. Frustra- 
tion and metal-binding allostery assays used either 22 and 24 ALS-associated mutants 



respectively, or 90 glycine mutants (SI Table S3). Mechanical profiles are obtained 
after 20ns pre-equilibration from steered MD simulations (tether speed 2.5mm/s) 
in GBSA solvent with OPLS-AA/L force field parameters. Robustness checks are 
shown in SI Figure S15. Monte Carlo methods yield the statistical signifance (error 
« 2.7kJ/mol, SI Fig. S21). Fluctuation analysis used SPC explicit solvent. Metal 
binding free energies are found from WHAM including post-relaxation and validation 
by thermodynamic cycles. Frustration calculations include protein conformations and 
protein- protein contacts only; i.e. metals are implicit in determining protein confor- 
mation but metal-protein interactions are not explicitly included. Frustrated contacts 
were calculated using the frustratometer server http://frustratometer.tk. 
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Sl-text 

Work-extension profiles provide a measure of local mechan- 
ical stability, and have distance-dependent stiffness moduli. 

We performed pulling simulations on residues taken from the 
mid-points of the protein sequences of superoxide dismutase 
predicted to be either weak (referred to here as candidate epi- 
topes) or strong (candidate anti-epitopes) thermodynamically 
(see Fig. SI ). 
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Fig. SI. Ribbon representation of monomeric SOD1 structure with Cu and Zn 
metals shown as orange and gray spheres respectively Candidate misfolding-specific 
epitopes as predicted by the algorithm of Guest, Cashman and Plotkin [1] are colored 
red, and their residue numbers are indicated. In the Table - Epitopes, anti-epitopes, 
pulling residues, and resulting work values for Cu.Zn (SS) WT SOD1. 



A tethering point was placed on the C« atom at the center 
residue of a candidate epitope or anti-epitope (see Methods 
below), and another tethering point was placed on the C a 
atom closest to the center of mass of the protein (histidine 
46). A plot of the pulling force vs extension for a loading 
rate of 2.5 x 10 _3 m/s is shown in Figure 1(A) inset (a) of 
the main text, for the residues centered at the midpoints of 
the first candidate epitopes/anti-epitope (see Methods). The 
first (weak stability) epitope contains residues 5 — 15 so the 
C a atom of residue 10 is taken as a tethering point. The 
first anti-epitope predicted to be thermo-mechanically stable 
consists of residues 16 — 20 for which residue 17 is chosen as 
representative (see Methods). 

The forces fluctuate stochastically, however the work to 
pull to a distance x, being the integral of the force W(x) = 
f^F(x / )dx / results in a smooth curve (Fig. 1(A)). The work 
generally does not have a slope of zero as x — » on the length 
scale of ~ lA, because of an initial small-distance nonlinear 
response corresponding to a steep rise in force within ~ 0.1 
A. That is, a force response function that appeared to con- 
verge to a non-zero force as x — >• would correspond to a work 
function with linear behavior as x — ► 0. 

We interpret the initial steep rise in force as being due the 
collective effect of numerous strong bonds which seek to pre- 
serve the native structure. As distance is increased, the num- 
ber of restoring interactions, and/or the magnitude of these 
interactions, is decreased. Thus the effective modulus of the 
system as calculated by 2W(x)/x 2 is distance dependent, and 
softens with increasing distance (see Methods). A plot of the 
effective modulus for short distances < lA is given in inset 
(b) of Figure 1(A). 

Previous measurements of force vs. extension or force vs. 
time have shown that the force converges to non-zero values 
at short distances or times. This is the case for ligand binding 
simulations [2] where the force converged to ~ 50 — 100 pN for 
the shortest times, and in protein unfolding simulations [3,4] 
where the force converged to ~ 400 — 700 pN at the shortest 
distances. These observations are consistent with the steep 
initial rises in the force and corresponding distance-dependent 
moduli that we have resolved in the present study. 

It was also observed that pulling on a given residue re- 
sulted in large fluctuations in remote regions of the protein. 
For example pulling on residue 10 disordered a-helix 2 con- 
taining residues 133-138, and pulling on residue 17 disordered 
a- helix 1 containing residues 55-61 (Figure 1(A) inset figures). 
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Local mechanical strain, at least by pulling a residue, induces 
a non-trivial stress profile that results in induced disorder at 
remote regions in the protein. Such induced disorder may 
be a key ingredient in the propagation of misfolded SOD1 
conformations in ALS, as well as other misfolding diseases 
propagated by template directed misfolding. 

The origin of large stiffness moduli at very short (sub- 
Angstrom) distances is likely due to side chain docking. What 
is the origin of this highly local mechanical rigidity that gives 
rise to steep initial increases in force? From our pulling sim- 
ulations, it was observed that the forces required to extend 
well-structured parts of the protein were much larger than 
the forces required for parts of the protein that were poorly 
structured or disordered. For example, in the range of exten- 
sions from ^ 0.1 — 0.2A, the force on residue 17 in /3-strand 2 
of SOD1 was ~ 83piV, and the force on residue 10 in turn 1 
was « 67 'pN, while the force on residue 60 in the disordered 
Zn-binding loop of Zn-depleted SOD1 was ~ 52pN. As an ex- 
ample of a residue that should lack any side-chain docking, the 
force on residue 133 at the disordered, non-native C-terminus 
of the C-terminal truncation mutant E,E G127X is « 41pN. 
We thus investigated the phenomenon of short-range mechan- 
ical rigidity by calculating the components of the interaction 
energy as a tethered residue was pulled. 

Figure S2(a) depicts a schematic of the simulation proto- 
col, and Figure S2(b) shows the results. From these potential 
energy calculations, we see that the initial steep rise in force is 
due to the loss of short-range van der Waals and electrostatic 
interactions during the course of unfolding. The decrease in 
interactions is mainly between side chains (SCs) rather than 
backbone (Fig. S2(b) panel J): roughly 3/4 of the change 
in energy arises from SC-SC interactions. This effect thus 
appears to be due to the many-body interactions stabilizing 
native structures through SC-SC docking, likely formed in the 
latest stages of folding. 

Utility of making cumulative distribution. In the manuscript, 
we have analysed the mechanical profiles mainly by construct- 
ing cumulative distributions of the work profiles, rather than 
the more common probability distribution measurements. 
The reason behind representing the work profiles in the form 
of cumulative distributions is that this representation gives the 
best way to differentiate between work profiles of two differ- 
ent variants of SOD1. The mechanical work profiles are too 
noisy to compare their relative stability from the sequence- 
resolved work profiles. Histograms of two work profiles also 
do not clearly differentiate between two different variants of 
SOD1 (see Figure S3). However, the cumulative distributions 
of the work profiles make them completely distinguishable and 
help to easily identify the relative order of stability among the 
variants. 



SOD1 and various modified SOD1 proteins, including mutant 
SOD1 (see Figure S13), de-metallated SOD1, and disulfide- 
reduced SOD1. 
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Fig. S2. (Panel (a)) Schematic representation of the method of calculation of in- 
teraction energy terms shown in panel (b). We have calculated the interaction energy 
terms of the atoms that are within 5 A radius from the C a atom of residue 10 and 
70 when residue 10 is pulled - the blue sphere indicates the C a of residue 46 which 
is the center of the protein, the small red sphere indicates the C a atom of residue 
10 which is the tethering point, and small green sphere indicates the C a atom of 
residue 70. The arrows show the direction of pulling. The larger, semi-transparent 
red and green spheres have radii of 5 A, and enclose the atoms within 5 A from 
the respective C a atoms of the protein. (Panel (b)) Various terms in the potential 
energy as a function of distance, when residue 10 is pulled. The energies for residue 
70 are investigated as a control. Analyzing individual terms in the potential energy 
elucidates the reason behind the initial sub-angstrom steep rise in the mechanical 
force. Figures A, B, and C plot the rise in short-range van der Waals energy, short- 
range component of the Coulomb energy, and total potential energy as a function of 
distance. These show a concurrent rise on the length scale of the sudden rise in the 
force in Figure 1(A) inset (a), main text. No such distance-sensitive change is seen for 
angle energies in residue 10 (figure D), or for any energies of a control residue (70) far 
from the pulling site (figures E,F,G). Decomposing the potential energy terms into 
backbone-backbone (H), backbone-sidechain (I), and sidechain-sidechain terms (J) 
shows that BB-BB interactions play no role, and about 3/4 of the total contribution 
arises from SC-SC interactions, indicating SC docking plays a dominant role in small 
RMSD mechanical stability. 



The mechanical profiles of Cu,Zn (SS) WT and E,E (SH) 
WT SOD1 are different, and are independent of the starting 
PDB structure used to construct them. We can take the value 
of the work needed to pull a particular residue out to 5A as 
a representation of the mechanical rigidity of that residue. 
This value can then be scanned across the protein sequence 
to obtain a mechanical profile or fingerprint for a particular 
SOD1 variant. Obtaining a work value for a given residue 
is computationally intensive however, so we take a subset of 
48 residues as a "sparse sampling" of the mechanical profile, 
in order to compare mechanical stability between SOD1 vari- 
ants. The specific residues chosen are given in the Methods 
section. Mechanical profiles may be compared between WT 
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Fig. S3. Panel A: Probability distribution of work values of Cu.Zn (SS) WT and 
E,E G127X. Panel B: Probability distribution of work values of E,E G127X and E,E 
(SH) WT SOD1. 
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We first ensured that the mechanical profile obtained for 
a given S0D1 variant was independent of the initial condi- 
tions used in the simulations, in particular for SOD1 variants 
that currently have no PDB structure such as E,E (SH) WT 
SOD1. Equilibrated structures were generated as described 
in the Methods section below, and used as initial conditions 
for pulling simulations to generate mechanical profiles. Inset 
A of Figure S4 shows a plot of the mechanical scan for Cu,Zn 
(SS) WT SOD1, and the main panel of Figure S4 shows the 
mechanical scan for E,E (SH) WT S0D1. 



ferent; in particular the combination of metal depletion and 
disulfide reduction reduces the overall mechanical stability of 
several regions of SOD1. We analyze this in more detail in 
the manuscript. 

An implicit solvent model is sufficiently accurate to obtain 
the mechanical profile. To test the accuracy of the general- 
ized Born surface area (GBSA) implicit solvent model in de- 
termining the mechanical profile, we have performed pulling 
simulations on SOD1 in explicit solvent, where waters interact 
through the SPC force field. 
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Fig. S4. The mechanical work profile is independent of the crystal/NMR structure 
used to generate the initial ensemble in a pulling simulation. The main panel shows 
the mechanical profile that results from 2 different constructions of the initial ensem- 
ble of E,E (SH) WT SOD1. In one construction we start from the solution structure 
of E,E (SS) WT SOD1, reduce the disulfide bond, and equilibrate the system before 
starting simulations. In another construction we start from the crystal structure of 
Cu.Zn (SS) WT SOD1, remove the metals and reduce the SS-bond, and then equili- 
brate. (Inset A) The mechanical profile obtained from 2 different crystal structures of 
Cu.Zn (SS) WT SOD1 (1HL5 and 2C9V), equilibrated for 20ns and then simulated 
as described in the Methods. Both models give the same mechanical profile to within 
about 2.7 kJ/mol. (Inset B) Work profiles for Cu.Zn (SS) WT and E,E (SH) WT 
SOD1 are seen to be significantly different, with E,E (SH) WT SOD1 generally having 
weakened mechanical susceptibility in various regions, but occasionally showing stiffer 
response in some locations. 



In each case, the protein was constructed from two dif- 
ferent initial models of the protein structure, using two dif- 
ferent PDB structures as starting points. We found that the 
mechanical profile of a particular SOD1 variant was nearly 
independent of how that variant was constructed, reinforc- 
ing the reliability of the mechanical scan. In inset A of Fig- 
ure S4, mechanical work profiles correspond to crystal struc- 
tures 1HL5 [5] and 2C9V [6] of Cu,Zn (SS) WT SOD1. In the 
main panel of Figure S4, mechanical work profiles correspond 
to E,E (SH) WT SOD1, obtained by modifying either the 
Cu,Zn (SS) WT crystal structure 1HL5, or the NMR struc- 
ture 1RK7 of E,E (SS) WT SOD1 [7]. We note that these 
PDB structures are equilibrated for 20 ns before any simula- 
tion measurements are taken. The mean error between SOD1 
variants, as determined by the z-test described in the Meth- 
ods section, is 2.7 kJ/mol. On the other hand, the standard 
deviation of all 48 of the work values themselves for a given 
variant (e.g. 1HL5) is 18.3 kJ/mol, which is a factor of about 
6.8 larger than the error. The mean error between variants 
used to construct the same initial condition indicates the level 
of accuracy of the simulations, so that differences in work pro- 
files (e.g. between mutant and WT) must generally be larger 
than this mean error to be significant. Inset B to Figure S4 
plots the mechanical profiles of Cu,Zn (SS) WT SOD1 and 
E,E (SH) WT SOD1. They are seen to be significantly dif- 



160 
140 



r=0.802 
P=1.2e-19 
0-8|<f Slope=0.83 

11 



1200- 



SlOO 



60 




20 40 60 80 100 120 140 160 

_JL_ _A ReS ' dUe lnde * ^ 

Fig. S5. The main panel shows that the work profiles agree between implicit and 
explicit solvent simulations. (Inset A) Scatter plot of the root mean square fluctu- 
ation (RMSF) for heavy atoms in the protein in both implicit and explicit solvents; 
r = 0.802, P r = 1.2e-19. Green solid line is the best fit line with a slope of 
0.83. Blue dashed line is the line of slope unity with y = x. Red dashed line is the 
median-fit line with an equal number of data points above and below it, and has a 
sloe of 1.08. (Inset B) Difference in the distributions of RMSF between the implicit 
and explicit solvent models. This shows an enhancement of small RMSF values and 
suppression of large RMSF values for the implicit solvent model. (Inset C) Scatter 
plot of work values obtained from implicit solvent vs explicit solvent models; these 
models correlate with r = 0.991. 



Inset A of Figure S5 plots the root mean squared fluc- 
tuations of heavy atoms, obtained from 20ns simulations for 
both implicit and explicit solvent models. The two systems 
have comparable thermal fluctuations, though the best fit line 
(green line in inset A of Fig. S5) has a slope less than unity, 
indicating somewhat larger fluctuations in the explicit sol- 
vent model. Interestingly however, the number of data points 
above and below the best fit line are 624 and 455 respectively, 
indicating that there are non-Gaussian fluctuations and out- 
liers between the two models. 

This skew in the data may be investigated through the 
distributions of RMSF, for both the implicit and explicit sol- 
vent models. These distributions are different. The difference 
in the distributions, Pimp — Pex P , as a function of RMSF, is 
plotted in inset B of Figure S5. This shows that the implicit 
solvent model overestimates small fluctuations, and underes- 
timates large fluctuations, as compared to the explicit solvent 
model. 

We can investigate what slope line would give equal num- 
bers of data points above and below it, as an additional mea- 
sure of the validity of the implicit solvent model. By this 
measure the implicit solvent model agrees much better with 
the explicit solvent data: the median fit line with equal num- 
bers of data above and below it has a slope of nearly unity 
(slopes 1.08). 

The imperfect correlation between implicit and explicit 
solvent fluctuations prompts a comparison of the work values 
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in implicit and explicit solvent. A scatter plot of work val- 
ues to pull the same residues to 5 A in implicit and explicit 
solvent models is shown in inset C of Figure S5. Interest- 
ingly, here we see a much stronger correlation for the values 
of mechanical work. The mechanical work values result from 
a significant non-equilibrium perturbation compared to the 
local fluctuations in the native basin, the latter of which are 
apparently more sensitive to solvent conditions. A mechan- 
ical scan of 48 residues is shown in the main panel of Fig- 
ure S5. Here the implicit and explicit solvent models show 
good agreement: the standard deviation of the difference in 
work profiles is about ~ 2.5 kJ/mol which is less than the 
mean error of ~ 2.7 kJ/mol obtained from using different crys- 
tal structures to set up the same initial conditions. One caveat 
is that the implicit solvent work values tend to be slightly 
higher than those of the explicit solvent: the mean of AW 
is about 1.1 kJ/mol, so that a z-test indicates the data AW 
arise from a gaussian distribution of mean zero only when the 
standard deviation of the gaussian distribution is 4 kJ/mol 
or larger. Overall, the data indicate that the implicit sol- 
vent model yields mechanical profiles that are as reliable as 
those obtained from much more time-intensive explicit SPC 
solvent simulations, but perhaps with modestly larger values 
1 kJ/mol) of work. 



A Go model does not adequately capture the mechanical 
profile to sufficient accuracy. Since the implicit-solvent model 
captured the mechanical profile to good accuracy, we pursued 
a further step in simplifying the energy function, to see if a Go 
model [8] would succeed in reproducing the mechanical pro- 
file. The Go model recipe [9] (see Methods) takes heavy atoms 
within 2.5 A, and applies native contacts to them with an LJ- 
like 6-12 potential. The Go recipe also attributes energy to 
native- like dihedral angles. The overall energy scale of all in- 
teractions is given by 1 kJ/mol times the number of atoms in 
the system. This recipe is intended to approximately account 
for all native stabilizing interactions as well as solvation free 
energy. 

Figure S6 plots the work profiles of Cu,Zn (SS) WT SOD1, 
for both an all-atom implicit-solvent model and an all-atom 
Go model. Perhaps surprisingly, the default energy scale in 
the Go model, 1 kJ/mol times the number of atoms, cap- 
tures the overall energy scale of the work profile quite well: 
both energy functions resulted in variation of the work from 
about 40 kJ/mol to about 120 kJ/mol. However, from a blind 
comparison of the cumulative distributions for the implicit 
solvent and Go models, one would conclude they were dif- 
ferent proteins, so the distribution of work values is signifi- 
cantly different. One can adjust the overall energy scale in 
the Go model to better capture the mechanical work distribu- 
tion, but the optimal value of the energy scale is not known a 
priori. Moreover, the correlation between the implicit-solvent 
and Go models, r = 0.377, is not strong (Fig. S6 inset A). 
Increasing the overall Go energy scale by a factor of 1.1 to 
improve the comparison of the cumulative distributions (in- 
set C of Fig. S6) does not improve the correlation between 
work values: r = 0.385, P — .007. The green line in inset 
A of Figure S6 indicates the best linear fit between the Go 
and implicit solvent models. The most significant outlier on 
the scatter plot is residue 141, a glycine (circled data point in 
Fig. S6 inset A), which also can be seen to have the largest 
discrepancy in the mechanical profile. It has the largest work 
in the implicit-solvent model, and one of the smallest in the 
Go model. Excluding this residue increases the correlation be- 
tween the two models to 0.621 (blue line in inset A). Why is 
it so anomalous? The residue resides in the so called electro- 



static loop, which is enriched in charged and polar residues, 
and coulombic energies are not explicitly treated in the Go 
model, which only contains 6-12 van der Waals-like interac- 
tions. 
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Fig. S6. Work profiles of Cu,Zn (SS) WT SOD1 obtained from an implicit sol- 
vent model, and a Go model. Both models are all-atom. (Inset A) Scatter plot of 
the work values obtained from both models. The green line is the best fit line to 
the data, which has a correlation that is weak (r = 0.377) but statistically sig- 
nificant (P = 0.008). Note that the energy scales range from about 40kJ/mol 
to 120kJ/mol for both models. Omitting one amino acid in the electrostatic loop, 
residue G141 (green circled data point in the lower right of panel A), increases the 
correlation to 0.62 and the significance to 3e-6 (blue line). Either with or without 
residue 141, the slope of the line is less than unity however, indicating that stabiliz- 
ing energetics are missing in the Go model. (Inset B) Distribution of the electrostatic 
potential energy within a sphere of radius 5A centered at the Cct atom, for all 
residues in the monomeric protein. Residue 141 has one of the largest contributions 
of electrostatic energy, which explains why its work value in the implicit solvent model 
was much higher than that in the Go model, which does not explicitly account for 
electrostatics. (Inset C) Cumulative distributions of the work values obtained from 
the implicit solvent (blue) and Go (red) models. The mean work difference in the 
cumulative distributions between the two models is ~ 7 kJ/mol. The green cumu- 
lative distribution in inset C corresponds to a Go model that has been reweighted to 
have contact and dihedral energies that are 1.1 X as strong. This shifts the work 
distribution to larger values, but the values themselves still do not correlate well with 
those in the implicit solvent model: r = 0.385, P = 0.007. 

We thus investigated the electrostatic component of the 
energy within a sphere of radius 5 A, centered at the Ca atom 
for every residue, to construct the histogram in inset B of Fig- 
ure S6. The energies plotted are the mean values of the en- 
ergy from an equilibrium simulation at 300K. The histogram 
of electrostatic energy for glycine 141 is also plotted. Indeed, 
residue 141 has one of the largest electrostatic contributions 
to its energy. This is impressive because of the small size, 
apolarity, and neutrality of the residue. Electrostatic contri- 
butions to protein stability, for example due to ion pairs or 
partial charges in either close proximity or in low dielectric 
environment, may be poorly accounted for in Go models. 

The mechanical profile accurately reflects the free energy 
profile of the protein. The work to pull a given Ca atom to 
5A is a non-equilibrium measurement of mechanical stiffness, 
and one can ask whether it accurately represents the ther- 
modynamic stability of that region. To address this ques- 
tion, we obtained the free energy to separate each of the 48 
Ca atoms used in the mechanical pulling simulations by 5A. 
The procedure for obtaining the free energy is described in 
the Methods section. Figure S7 plots the work values for 
all 48 residues used in the pulling simulations of Cu,Zn (SS) 
WT SOD1, vs. the free energy values for the correspond- 
ing residues as obtained from the weighted histogram analysis 
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method (WHAM). The correlation coefficient is 0.96, indicat- 
ing that the relative mechanical rigidity can be used to predict 
the relative thermodynamic stability. 
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Fig. S7. (Red squares) Work to pull a given residue to 5A vs the free energy 
change for a fluctuation to separate that residue by 5A, for Cu,Zn (SS) WT SOD1. 
Free energies on the abscissa are obtained from umbrella sampling and the weighted 
histogram analysis method (WHAM). Both work values and free energies, for the red 
squares, are obtained using a slow pulling speed of 2.5 X 10 — 3 m/s. All 48 residues 
used in the pulling simulations are shown. The correlation between work and free 
energy values is very strong, r = 0.96. For the pulling rates used in our study, the 
work values are about 1.3 X as large as the free energies, and the slope of the best 
fit line is 1.35. Blue squares: WHAM-derived free energy changes, as obtained from 
pulling simulations at two different pulling rates: The abscissa values give the free 
energy for a pulling rate of 2.5 X 10 — 3 m/s as above, the ordinate values have a 
pulling rate of 10 m/s. In addition to a near perfect correlation, the slope defined 
by the best fit line to the two sets of data is nearly unity (purple line), and the data 
is well-fit by the line y = x (cyan line) indicating that the free energy values are 
independent of the pulling speed used to obtain the initial data, and thus have been 
reliably determined. 



The work values are higher than the free energy values 
however, by a factor of about (W/F) « 1.3. The average 
mechanical work, as a non-equilibrium measurement, always 
exceeds the free energy change that would be due to rare equi- 
librium fluctuations. 

Since a faster pulling rate results in more deformation of 
the protein, different pulling rates can in principle result in 
initial conditions that, after umbrella sampling and WHAM, 
give different free energy profiles. We checked this by per- 
forming WHAM calculations at two different pulling speeds, 
2.5 x 10 _3 m/s and 10 m/s. The faster pulling speed results 
in more deformed protein structures that were used as initial 
conditions, however, each initial condition is always equili- 
brated for 10 ns in an umbrella potential as described in the 
Methods, which should remove most or all initial deformation 
effects. The free energy values thus obtained did not depend 
on the initial pulling speed used to generate the initial con- 
ditions for the WHAM protocol: they are within a factor of 
1.005. We thus used relatively fast pulling speeds of 10 m/s to 



obtain initial conditions used to calculate free energies from 
the WHAM method. 



The mechanical profile obtained from about 40 residues cap- 
tures the distribution of work values for a given SOD1 variant 
to sufficient accuracy. Inset A of Figure S8 shows the mechan- 
ical work profile to pull residues to 5 A, for every residue be- 
tween the N-terminus and residue 40. Residues from the orig- 
inal set of 48 are shown in red, others in blue. The values de- 
viate significantly residue to residue, with a correlation length 
less than the putative value of ^ 5 amino acids corresponding 
to the original data set. By calculating the residue-residue 
correlation function of the work (Wi • W^+ n ) and fitting to 
exp(— n/£ p ), the sequence correlation length is found to be 
« 2.83. The work values found from the mechanical scan us- 
ing the original sampling of 48 residues should thus not be 
interpreted as consensus values for the corresponding regions. 
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Fig. S8. (Inset A) Mechanical work profile to pull residues to 5A, for every 
amino acid between the N-terminus and residue 40. Residues from the original 
set of 48, discussed in the Methods subsection on residues used for mechanical 
scans, are shown in red, others in blue. Work values deviate significantly residue 
to residue, with a sequence correlation length £ p 2.83, as defined through 
(Wi • Wi+ n ) oc exp(— n/ip). The work values found from the mechanical 
scan using the original set of 48 residues should thus not be interpreted as consensus 
values for the corresponding regions. (Main panel) Cumulative distributions of the 
work values for Cu,Zn (SS) WT SOD1, constructed simply by rank ordering the work 
values and plotting the fraction of the total number vs the work. Each cumulative 
distribution corresponds to a specific number of data points (work values) as given 
in the legend, which are randomly selected from the total set of 153 data points. As 
the number of data points is increased, the cumulative distribution converges to that 
for the full data set. The average deviation of work values is plotted in inset B. By 
40 data points, the cumulative distribution has converged to within ~ 1.08 kJ/mol 
of the full 153-data point distribution. By 50 data points, it has converged to within 
W 0.88 kJ/mol of the full distribution. 



One can ask if the mechanical scan still has utility then. 
The main panel of Figure S8 shows a cumulative distribution 
of the work values for Cu,Zn (SS) WT SOD1, which is con- 
structed simply by rank ordering the work values and plotting 
the number vs the work, subsequently normalizing to unity. 
Several curves are shown: each curve corresponds to a given 
number of data points randomly selected from the total set of 
153 data points. As the number of data points is increased, 
the distribution converges to that containing the full data set. 
Inset B of Figure S8 plots the mean deviation in work val- 
ues between cumulative distributions. After about 40 data 
points, the cumulative distribution converges to within about 
1.08 kJ/mol of the full distribution using 153 data points. 
This deviation is smaller than the deviation in work values 
obtained from different starting PDB structures (see above 
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subsection "The mechanical profiles of Cu,Zn (SS) WT .." 
and Fig. S4). That is, if one constructs cumulative distri- 
butions from the work values in Figure S4, the mean work 
difference between cumulative distributions is « 1.2kJ/mol. 
The data set used for our analysis, given in the Methods sec- 
tion, contains 48 residues, and has a mean deviation from the 
153-residue cumulative distribution of about 0.9 kJ/mol. 

We found that many discrepancies of SOD1 variants from 
WT were difficult to disentangle from the work profile, but of- 
ten emerged naturally from the cumulative distribution. Me- 
chanical scans were thus used to construct the cumulative dis- 
tributions, which then allowed us to distinguish stabilizing 
energetics in various forms of mutant SOD1. For example, 
we find below that the cumulative distribution is different for 
differently metallated variants of SOD1 (Figure S14). 
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Fig. S9. Conformational disorder in the absence of metals, as obtained from equi- 
librium simulations, is largest in Zn-binding and electrostatic loops. Comparison of 
root mean squared fluctuations (RMSF) (A,B) and solvent accessible surface area of 
backbone amides (SASAjv) (C,D), for Cu.Zn (SS) WT and E,E (SS) WT SOD1. Re- 
sults are averaged over a 20 ns equilibrium simulation trajectory. Panels A and C plot 
the respective quantities for both proteins, and panels B and D plot the differences 
between E,E (SS) WT and Cu,Zn (SS) WT proteins, i.e. the change in RMSF or 
SASAjv upon loss of metals. Both quantities indicate substantial increase in dynamic 
disorder and loss of native structure in loops IV and VII, and to a lesser extent loop 
VI, upon loss of metals. 



Equilibrium dynamical results. Removing metals from WT 
protein results in substantially increased dynamical fluctu- 
ations (RMSF) in loops IV (ZBL) and VII (ESL), and to 
some extent loop VI (Figure S9). The RMSF of Cu,Zn(SS) 
WT and E,E(SS) WT, along with their difference, ARMSF, 
are shown in Figures S9A,B. Loops IV and VII also show 
increased solvent accessible surface area of backbone amide 
nitrogen (SASAjv) upon loss of metals (Fig. S9 panels C,D). 

The dynamic effects of fluctuations between E,E(SH) and 
G127X seen in Figure 2C of the main text are not recapit- 
ulated in backbone amide solvent exposure: loop IV of E,E 
(SH) WT SOD1 is not significantly more solvent exposed than 
loop IV in E,E G127X, and the N terminus of E,E G127X 
from strand f37 onwards is more solvent exposed than the cor- 
responding residues in E,E (SH) WT SOD1 (Fig. S10). 

The profile of potential energy can be obtained by finding 
the potential energy within a sphere of radius 8A, centered 
about a given Ca atom, and then varying the Ca index along 
the sequence. Consistent with the increased mechanical sta- 



bility of G127X with respect to E,E(SH) WT (Fig. IB), the 
potential energy profile of G127X is generally more stabilized 
than that of E,E(SH) WT (Fig. Sll). 
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Fig. S10. SASA of backbone amide nitrogen atoms, from equilibrium simulations 
of E,E G127X, E,E(SS) WT and E,E(SH) WT SOD1. 
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Fig. Sll. Interaction potential energy between all atoms within an 8 A sphere 
of the Co. atom of each residue, for E,E(SH) WT SOD1 and E,E G127X SOD1. 



Comparison of metal binding free energy, thermodynamic 
stability, and mechanical stability. The thermodynamic sta- 
bility of the E,E(SS) monomer for several SOD1 mutants has 
been obtained previously by Oliveberg and colleagues [10]. 
The free energy difference between WT and mutant free en- 
ergy of unfolding is plotted in Figure S12. As well, the change 
in average mechanical work values, from WT to mutant, is 
plotted in the same figure. The work values have been nor- 
malized by a factor of 1.35, the slope of Figure S7, to con- 
vert work values to effective free energies. Finally, the dif- 
ference in metal binding free energy, WT minus mutant, for 
both Cu and Zn, is plotted in the same figure. All free en- 
ergy changes are positive and between 1-25 kJ/mol. However, 
there is no significant correlation between any of these quan- 
tities, with the exception of the binding free energies of Cu 
and Zn (r = 0.91, p = 6.6e-4). 



Cumulative distributions of mechanical work for an ALS- 
associated mutant. Figure S13 shows the cumulative distri- 
butions of mechanical work for both WT SOD1 and A4V, for 
several different post-translatinal modification (PTM) states. 
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The Cu,Zn(SS) mechanical stabilities of WT and A4V are 
very similar, except for the weaker regions. In contrast, 
the E,E(SS) protein is substantially mechanically destabilized 
upon mutation, and the E,E(SH) protein is even further me- 
chanically destabilized upon mutation. Interestingly, over 
much of the distribution E,E(SH) WT is stabilized with re- 
spect to E,E(SS) WT. This stabilization is also observed by 
examining the change in potential energy upon disulfide re- 
duction [11]. 
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Fig. S12. (Black) Difference in free energy of Cu binding, WT minus mutant, 
(Red) Difference in free energy of Zn binding, WT minus mutant, (Green) Difference 
in average mechanical work values, WT minus mutant. The work values have been 
normalized by the slope of Figure S7 to convert them to effective free energies. (Blue) 
Difference in experimental thermodynamic free energy of unfolding [10], WT minus 
mutant. 
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Fig. S13. Cumulative distributions of work values for non-post-translationally 
modified (PTM) variants of WT SODl, i.e. those missing metals and/or disulfide 
bond. Cumulative distributions of A4V SODl in various PTM states are also shown. 



The change in mechanical stability upon mutation for the 
E,E(SH) state is in contrast to the change in mechanical sta- 
bility upon truncation for G127X (Figure IB, main text). The 
mutations are quite different- one replaces an Alanine with a 
Valine and the other more drastically removes 20 residues and 
mutates 6 C-terminal residues. 

Cumulative distributions of WT SODl PTM variants. Fig- 
ure S14 shows the cumulative distributions of Cu,Zn(SS) 
SODl along with several WT SODl PTM variants, including 
E,Zn(SS), Cu,E(SS), Cu,Zn(SH), and E,E(SS) WT. Compar- 



ing the mechanical destabilization of the premature variants 
with Cu,Zn(SS) SODl, we see that E,E(SS) SODl is the least 
mechanically stable. Comparing individual PTM states, the 
loss of Zn results in the largest destabilization, followed by 
disulfide reduction, followed by Cu depletion. 
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Fig. S14. Cumulative distribution of work values for various PTM states of WT 
SODl, as indicated in the legend. 



ALS-associated mutants. ALS-associated mutants considered 
in the main text were: A4V, W32S, G37R, L38V, G41D. 
G41S, H43R, H46R, H46R/H48Q, T54R, D76Y, H80R, G85R, 
D90A, G93A, G93C, I113T, G127X, D124V, D125H, S134N 
and L144F. 

Robustness tests of the results. In the simulations, mechan- 
ical fingerprints were determined at T — 300i^, and pulling 
speed v = 2.5mm/s; both numbers are comparable to those in 
AFM experimental assays. The pulling distance Ax = 5Awas 
taken to represent a significant perturbation from the native 
structure, but not so much so as to globally unfold the pro- 
tein. To test the robustness of our results to varying external 
conditions, we have taken one of our conclusions, namely that 
E,E C57S 1-127 SODl is mechanically stabilized with respect 
to E,E C57S/C146S 1-153 SODl (see Figure ID main text), 
and we have varied temperature, pulling speed, and pulling 
distance. Temperature was increased to 310i^, pulling speed 
was altered to v = 1.5mm/s and 4mm/s, and pulling distance 
was altered to Ax — 3,4,6 and 7A. The results are summa- 
rized in Figure S15. 

Panel A of Figure S15 is at the conditions used through- 
out our analysis, and reproduces Figure ID of the main 
text. Increasing the temperature to biological temperatures 
(T = 310K) rather than lab temperatures preserves the con- 
clusion and only slightly reduces the statistical significance. 
Decreasing or increasing the pulling speed to the values noted 
above tends to decrease or increase all work values by 2- 
3kJ/mol, but preserves the conclusion and leaves the signifi- 
cance nearly unchanged. Panels E-H of Figure S15 show in- 
creasing pulling distance Ax from 3Ato 7A. The conclusions 
are preserved for all distances, and the statistical significance 
of the conclusion increases with increasing distance pulled. 
Taken together, the above tests indicate that the conclusions 
in the manuscript are robust to varying external conditions. 

Work values projected on SODl structure. A mechanical scan 
was performed on Cu,Zn(SS) WT SODl for every Cot atom, 
and the work values were then projected on the native pro- 
tein structure in Figure S16 panel A. As mentioned in the 
text, the correlation length of the work values along the se- 
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quence is about 3, so the projection color-coded by work value 
is discontinuous. Applying to the work values a simple tent- 
shaped smoothing function that smoothes over 5 residues gives 
the plot of Figure S16 panel B. This shows that for example 
parts of loops IV and VII are mechanically more rigid than 
some beta sheets, indicating significant stabilization by the 
presence of the metals. 
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Fig. S15. Cumulative distribution of mechanical profiles showing E,E C57S 1-127 
SOD1 is mechanically stabilized with respect to E,E C57S/C146S 1-153 SOD1 (see 
Figure ID main text). Temperature T, speed v, extension Ax, and statistical sig- 
nificance of the result p as defined in the Methods below, are given for each panel. 
(Panel A) Assay done at conditions generally used in the main text. (Panel B): Same 
as panel A but at T = 310K. (Panels C,D): Same as panel A but at pulling speeds 
v = 1.5mm/s and 4mm/s. (Panels E-H): Same as panel A but at pulling distances 
3,4,6 and 7 A. Statistical significance monotonically increases with pulling distance. 



Frustratometer analysis of a non-ALS related alanine scan 

of SO Dl. Figure S17 shows the results frustratometer anal- 
ysis for both Cu,Zn(SS) SOD1, and E,E(SS) SOD1. The 
mean number of highly frustrated contacts for each variant 
is 2.2 and 2.45 respectively, indicating that the E,E(SS) state 
is more frustrated by about 38 contacts. In the main text we 
used frustatometer analysis [12] to show that more frustration 
was present in the E,E(SS) state of SOD1 than the Cu,Zn(SS) 



state, by showing that upon mutation, more highly frustrated 
contacts were introduced in the Cu,Zn(SS) state, but frustra- 
tion was removed by mutation in the E,E(SS) state. A set of 
22 ALS-associated mutations were taken. Here, we check that 
the conclusion is general- not restricted to ALS-associated 
mutation- by examining the effects of 30 X — »> A mutations. 
An alanine scan across the sequence is taken, wherein every 
5th residue in the SOD1 sequence is mutated to alanine if (1) 
it is not an alanine in the WT sequence and (2) the mutation 
to alanine does not lead to an ALS-associated mutant. When 
we encountered alanine in the WT sequence (residue 55, 60, 
95, 140 and 145), we took either the previous or the next 
residue for our analysis as indicated below. As well, D90A is 
an ALS-associated mutation, so for this case we chose residue 
91, i.e. K91A. 




Fig. S16. (Panel A): Work values to pull all 153 residues to 5 A for Cu,Zn(SS) 
WT SOD1, projected on the native SOD1 conformation - work values increases from 
blue to red. The very low correlation length of work values along the primary sequence 
is evident from the discontinuity of the color scheme. (Panel B): A smoothed work 
profile was generated from the all-residue work profile used in panel A and projected 
on the SOD1 conformation. (Panel C) The raw and smoothed work values as a 
function of sequence index. 




Fig. S17. Mean number of highly frustrated contacts in the Cu,Zn(SS) and 
E,E(SS) states of WT SOD1, as a function of residue index. 



The 30 residues mutated to alanine are: 5,10,15,20,25,30, 
35,40,45,50,56,59,65,70,75,80,85,91,96,100,105,110,115, 120,125, 
130,135,139,144 and 150. We again took 50 snapshots from 
the equilibrium simulations of each protein to calculate the 
ensemble average of highly frustrated contacts n HF . We per- 
formed frustratometer analysis for each Cu,Zn(SS) or E,E(SS) 
alanine mutant, and calculated the difference in number of 
highly frustrated residues between WT SOD1 and the alanine 
mutant as a function of sequence index i: ?r^ A (i) — ™w"tW- 
We then averaged this quantity over all alanine mutants, 
(nx^ A (i)) x ~ n& F T (i) = (n HF ) MUT - n^ F T , and plotted 
the result in Figure S18, panel A as a function of sequence 
index. 

Upon mutation to alanine at the sequence locations above, 
the number of highly frustrated contacts increased on aver- 
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age by 9 for Cu,Zn(SS) S0D1 and decreased on average by 
29 for E,E(SS) SOD1. This indicates that mutations which 
normally increase frustration in the Cu,Zn(SS) state of WT 
SOD1 generically decrease frustration in the WT E,E(SS) 
state. This recapitulates the findings in the main text for 
ALS-associated mutations, and supports the conclusion that 
the E,E(SS) state is frustrated. 
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Fig. S18. (Panel A) The number of frustrated contacts within a sphere of radius 
5A centered on each C a atom, is found as a function of residue index, n HF (i). 
Ensemble averages are taken from 50 snapshots in an equilibrium simulation to ob- 
tain n HF (i). This is done for both the Cu.Zn(SS) state and the E,E(SS) state, 
for both the WT sequence, and for 30 alanine mutants (Table S3). The 30 alanine 
mutant sequences are averaged to obtain the ensemble and mutant averaged number 
of contacts as a function of residue index i, (n HF (i)) MUT . Plotted is the differ- 
ence between (n HF (i)) MUT and the corresponding numbers for the WT sequence 
^wt(*) ■ A positive number would indicate an increase in frustration upon muta- 
tion. Cu,Zn(SS) state is shown in blue and has an average of +9 contacts. E,E(SS) 
state is shown in red and has an average of -29 contacts. (Panel B) Distribution of 
the thermal average potential energy change upon mutation, A?7 X — >A, for the 30 
alanine mutants given in the text. Distributions for both the Cu,Zn(SS) state and 
the E,E(SS) state of SOD1 are shown. (Panel C) Scatter plot of the interaction free 
energy of a given residue's side chain with Cu, vs. the mean equilibrium number of 
highly frustrated contacts that residue has in the Cu.Zn(SS) state of WT SOD1. The 
scatter plot shows essentially no correlation. (Panel D) Same as Panel C, but for the 
interaction free energy with Zn. The scatter plot again shows no correlation. 



Potential energy analysis of a non-ALS related alanine scan 
of SODl.We have also calculated the difference in poten- 
tial energy upon mutation, for ALS-associated mutants in the 
main text, and the 30 alanine mutants described above (Ta- 
ble S3). The procedure for calculating the potential energy 
change upon mutation is described in the Methods. The po- 
tential energies, averaged over 50 equilibrium conformations, 
are obtained both before and after mutation, and the differ- 
ence obtained as AUx->a> This difference is obtained in both 
the Cu,Zn(SS) and E,E(SS) forms of SOD1. Plotted in Fig- 
ure S18, panel B are the distributions of AU^^a over the 
above 30 non-ALS associated alanine mutations, for the both 
the Cu,Zn(SS) and E,E(SS) forms of SOD1. The same quan- 
tities are plotted in Figure 4B of the main text for 22 ALS- 
associated mutations. In Figure S18, panel B, we see that all 
30 mutants raise (penalize) the potential energy of Cu,Zn(SS) 
SOD1, while most of the 30 mutants lower the potential energy 
of E,E(SS) SOD1. These findings are consistent with those in 
the main text, and indicate that the results that obtained for 



ALS-associated mutants are also common to non-ALS SOD1 
mutants, and so are a generic feature of SOD1. 

Correlation between metal interactions and frustration, and 
metal interactions and distance. Frustratometer analysis of 
the equilibrium E,E(SS) state of WT SOD1 shows that 90 
residues have at least 1 highly-frustrated contact (Fig. S17). 
The mean number of highly-frustrated contacts of each of 
these residues, 71e,e(ss)5 may be compared with the interaction 
free energy of that residue with the metal AG int. The interac- 
tion free energy is obtained from metal extraction as described 
in the main text and in the Methods below. This analysis di- 
rectly tests the correlation between the degree of frustration 
of a residue, and that residue's involvement in binding met- 
als. Figure 3, panels B and C of the main text show a strong 
correlation between metal interaction free energy and the frus- 
tration of a particular residue in the E,E(SS) state: the more 
frustration a residue has, the stronger its interaction with the 
metals. As a control test, the metal interaction free energy of 
a given residue may be compared with its number of highly- 
frustrated contacts in the Cu,Zn(SS) state. Here we see no 
correlation for either Cu or Zn (Fig. S18, panels C,D). Hence, 
on a residue by residue basis, frustration in the apo state of 
SOD1 facilitates metal affinity for both Cu and Zn. 




P = 3.7e-10 



Distance(A) 




Fig. S19. Panel (A): Residues color-coded by interaction energy with the Cu ion 
(depicted as a cyan sphere). The extent of interaction is strongest in magnitude for 
red colored residues and decreases to blue. Panel (B): same as (A) for the Zn ion 
(depicted as a grey sphere). Panel (C): Interaction energy with Cu correlates with 
the distance of the residue from the Cu ion; residues in close proximity more strongly 
interact with the metal. Panel (D): Interaction energy with Zn does not correlate 
with distance of the residue to the Zn ion, indicating non-local allosteric effects. 



The distance-dependence of the interaction free energy be- 
tween a residue and either Cu or Zn was investigated in Fig- 
ure 5 of the main text. Results were shown for the 24 mutants 
listed in Table S3. When the dataset corresponding to the 90 
residues with at least one highly frustrated contact is used 
(Table S3), the general result remains true that allosteric reg- 
ulation for metal affinity is significantly correlated with the 
proximity of a residue to the Cu binding site, but not with 
the proximity of a residue to the Zn binding site (Fig. S19). 
There is more scatter with this larger dataset (but also selec- 
tive to only include frustrated residues) in the plot measuring 
correlation between interaction free energy and distance to 
the Cu, and the correlation decreases somewhat from that in 
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the main text. The conclusion remains true that the affinity 
for Zn is imposed collectively across the whole protein. 



Relationship between mechanical work, fluctuations, poten- 
tial energy, and frustration. In Figure S20, we plot the cumu- 
lative work distribution, RMSF, time-resolved change in po- 
tential energy after mutation, and frustratometer results for 
the ALS-associated mutants studied here: G127X and A4V. 
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Fig. S20. Panel (A): Cumulative distribution of work values to pull residues to 
5A, for E,E G127X and E,E(SH) WT. Panel (B): same as (A), for E.E(SS) A4V and 
E,E(SS) WT. Panels (C,D): RMSF values as function of residue index, for the same 
proteins in Panels (A,B). Panels (E,F): time-resolved change in total potential energy 
after mutation, from E.E(SH) WT E,E G127X and Cu,Zn(SH) WT ->> Cu.Zn 
G127X (panel E) and from E,E(SS) WT ->> E.E(SS) A4V and Cu,Zn(SS) WT 
Cu.Zn(SS) A4V (panel F). Panel (G): Difference in the number of highly-frustrated 
contacts for each residue, G127X - WT, for Cu.Zn(SH) and E,E(SH) variants. Panel 
(H): Same difference as in (G), for A4V - WT, for Cu.Zn(SS) and E.E(SS) variants. 



These two mutants show different behavior: The cumu- 
lative distribution of work values shows that E,E G127X is 
stabilized with respect to E,E(SH) WT (Fig. IB in the main 
text and Fig. S20A). The mechanical stability of A4V shows 
the opposite effect however (Fig.s S14 and S20B): the muta- 
tion softens the protein. 



The dynamic fluctuations are consistent with the work- 
value analysis: the fluctuations are reduced upon mutation 
for G127X (Fig.2C in main text and Fig. refngcompareC), 
but enhanced for A4V (Fig. reffigcompareD). Likewise, the 
total potential energy in the apo protein decreases upon mu- 
tation for G127X (Fig. S20E) but increases upon mutation 
for A4V (Fig. S20F). The potential energy increases in the 
holo protein for both mutants. 

The frustratometer analysis indicates that the frustration 
increases in the holo protein for both mutants, 5 contacts for 
G127X and 4 contacts for A4V. However for the apo proteins, 
the number of highly frustrated contacts in E,E G127X de- 
creases from E,E(SH) WT by 16 contacts on average, and 
the number of highly frustrated contacts in E,E G127X de- 
creases from that in E,E(SS) WT by 35 contacts. For E,E(SS) 
A4V, frustration is decreased from that in E,E(SS) WT by 31 
contacts. Thus frustration decreases from the E,E(SS) state 
slightly more in G127X than in E,E(SS) A4V. A general anal- 
ysis of the decrease in potential energy and frustration for apo 
mutants, along with the results from their individual mechan- 
ical scans, is an interesting topic for future work. 



Methods 

Steered molecular dynamics simulations. Steered molecular 
dynamics (SMD) using constant-velocity moving restraints 
was used to simulate the action of moving AFM cantilever on 
a protein. The general procedure used two tethering points for 
the pulling simulations. In our study of SOD1 monomer, one 
tether was placed at the position of the alpha carbon closest 
to the center of mass of the protein (generally C a (46)), and 
another tether was placed on the alpha carbon of a particular 
amino acid to be pulled on. To implement a mechanical sta- 
bility scan across the surface of the protein, every 5th amino 
acid along with the first and last residues was selected for a 
pulling simulation. This coarse- grains the mechanical profile, 
so that the work values obtained are a sampling of the work 
values for all residues. 

SMD simulations [13,14,15,16] were preformed in GRO- 
MACS to monitor the forced unfolding of SOD1 variants. An 
all- atom representation of the protein was used, with OPLS- 
AA/L parameters [17,18], and a generalized Born surface area 
(GBSA) implicit solvent model was used, with dielectric con- 
stants of the protein and solvent taken to be 4 and 80 respec- 
tively. The Onufriev-Bashford-Case (OBC) algorithm was 
used to calculate Born radii [19]. The phenomenological sur- 
face tension coefficient used in calculating solvation free en- 
ergies was 0.005 kcal/mol/(A 2 ) [20]. The LINCS algorithm 
was applied to constrain all bond lengths that contained a 
hydrogen atom [21]. 

Prior to simulation, energy was minimized using a steep- 
est descent algorithm to remove any potential steric clashes. 
Simulations were carried out with an integration time-step of 
2 fs and coordinates were saved after every 100 ps. The long- 
distance cut-off used for non-bonded interactions was 14 A for 
both Electrostatic and van der Waals (VDW) interactions. A 
cubic simulation box with periodic boundary conditions was 
used, with size such that all protein atoms were initially at 
least 20 A from any cubic face. For every SOD1 variant pro- 
tein considered, a simulation was run for 20 ns to equilibrate 
the protein before any pulling simulations were performed and 
data was collected. 

Because only moderate deformations were required to con- 
struct mechanical force-extension curves, a very slow pulling 
speed (for simulations) of 2.5 x 10 _3 m/s was admissible. This 
speed is comparable to those used in AFM experiments [22], 
and is 400 — 8000 x slower than the speeds generally used in 
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simulations [23,24,25]. The spring constant of the simulated 
AFM cantilever was taken to be 5 kJ/mol/A 2 . The average 
F(x)/x values from the epitope pulling simulations e.g. for 
residues 10 and 17, in the extension range of 1-5 A, are o.i 
and 7.3 kJ/mol/A 2 respectively, indicating that the protein ef- 
fective spring constant is comparable to the cantilever spring 
constant. The average modulus as determined by 2W(x)/x 2 
between 1 and 5 A is 7.72 and 10.35 kJ/mol/A 2 for residues 
10 and 17 respectively. The average modulus between and 1 
A is much higher: 67.08 for residue 10 and 91.46 for residue 
17 in the same units. Each pulling simulation ran until the 
change in distance between the two tethering points was 5 A. 

Explicit solvent simulations. Pulling simulations of WT SOD1 
were also performed in explicit water. Equilibrated starting 
structures were generated by placing the protein in a cubic 
box of simple point charge (SPC) water, such that all protein 
atoms were initially at least 20 A from any cubic face. This 
required 10866 water molecules to solvate the protein. Prior 
to simulation, the energy of the system was minimized using 
a steepest descent algorithm. The system was then equili- 
brated for 20 ns under a constant volume (NVT) ensemble, 
with temperature maintained at 300 K using the Berendsen 
weak coupling method. Pulling simulations were then carried 
out using an integration time-step of 2 fs, and coordinates 
were saved every 100 ps. The long-distance cut-off used for 
non-bonded electrostatic and VDW interactions was 14 A. 

We have also minimized and equilibrated E,E (SS) WT, 
E,E (SH) WT and E,E G127X variants of SOD1 in explicit 
solvent following the same procedure as described above for 
Cu,Zn (SS) WT SOD1. Each of the variants was equilibrated 
for 35 ns, and the last 20 ns of that trajectory was used for 
the calculation of RMSF and backbone amide SASA. 

Tethering residues. The C a atom of the residues listed in Ta- 
ble S2 were closest to the center of mass of the corresponding 
protein variant, and so were used as the central tether. When- 
ever the pulling residue appeared to be very close to the center 
of mass tethering residue, the tethering residue was shifted to 
the midpoint of the protein sequence to avoid artificially large 
forces. Specifically, for several variants listed in Table S2, 
residues 40, 45, 46, 50, 54 and 55 were either too close to the 
central C a , or were along the same beta strand as the central 
Col , and thus gave anomalous force-extension profiles probing 
covalent bonding topology moreso than non-covalent stabiliz- 
ing interactions. Thus when residues 40, 45, 46, 50, 54 and 55 
were pulled, the tethering residue was moved to corresponding 
residue indicated in Table S2. This always resolved the prob- 
lem of large forces, except for 1-110 SOD1. For this variant, 
we have taken residue 86, which is also close to the center of 
mass, because the midpoint residue (56) did not resolve the 
problem of anomalously large forces. 

Residues used for mechanical scans of protein stability. We 

have previously developed a model calculation of the free en- 
ergy of unfolding specific regions of a protein using the so- 
called single-sequence approximation [1], in analogy to sim- 
ilar models used in protein folding [26,27,28,29,30]. In this 
formalism, the resulting free energy landscape is constructed 
as a function of the center residue of the contiguous strand 
taken to be disordered, and the sequence length of the disor- 
dered strand. In our unfolding model, we have treated elec- 
trostatic and van der Waals energies in the system at the 
level of the CHARMM22 energy function [31,32]. Protein en- 
tropy was calculated from simulations of the unfolded ensem- 
ble in explicit solvent. Regions of low thermodynamic stabil- 



ity may be projected onto the native structure; these regions 
constitute candidate epitopes for unfolding-specific immuno- 
logic therapy, providing that the criteria of residue-specific im- 
munogenicity and uniqueness of the epitope sequence in the 
proteome are established. Antibodies that bind to disease- 
specific protein epitopes corresponding to non-native confor- 
mations displayed by disordered protein sequences have been 
developed for superoxide dismutate 1 (SOD1) [33] and Prion 
protein [34]. The candidate epitopes as determined from free 
energy landscape analysis for WT SOD1 are given in the table 
in Figure SI. 

Every 5th residue along the protein sequence, including 
the first and last residues, was taken as a tethering residue. 
As well, the central positions of the predicted epitopes and 
anti-epitopes were added to the data set. 

The first anti-epitope contains 5 residues with residue 18 
at its center, however the free energy landscape prediction 
gave a larger stability to candidate epitope 1 than epitope 
2 [1], so we chose residue 17 instead as potentially more rep- 
resentative for the stiff sequence of residues. For all other 
epitopes or anti-epitopes, either the center residue was cho- 
sen as the tethering point if the stand contains an odd num- 
ber of residues, or if the strand contained an even number 
of residues, the pulling residue was chosen randomly between 
the two center residues. 

If the center of the epitope happened to coincide with a 
multiple of 5 already included in the scan (e.g. residue 10), 
then the next residue (residue 11 in this case) was also taken. 
Thus a set of 48 residues was used in the mechanical scans: 
(1, 5, 10, 11, 15, 17, 20, 24, 25, 30, 31, 35, 38, 40, 45, 46, 50, 
54, 55, 60, 65, 70, 73, 75, 80, 85, 90, 91, 95, 100, 101, 105, 
108, 110, 115, 120, 121, 125, 130, 135, 136, 140, 141, 145, 146, 
150, 151, 153). 

Proteins considered. Proteins whose mechanical profiles were 
calculated are given in Table S2 below, along with the corre- 
sponding PDB entries used to generate equilibrated structures 
for simulations. 

Modeling proteins with no PDB structure. Some proteins 
studied here have no PDB structural coordinates available; see 
Table S2. For these proteins, a structure was built by modify- 
ing known PDB structures of similar proteins. For example, 
metal-depleted (E,E (SS)) WT SOD1 was created from E,E 
(SS) SOD1 structure(lRK7) by back-mutating Q133E, E50F 
and E51G using the PyMOL software package [35]. Metal- 
depleted, disulfide-reduced (E,E (SH) WT) SOD1 was cre- 
ated from both E,E (SS) SOD1 structure (1RK7) by reducing 
the disulphide linkage and mutating E50F, E51G, and Q133E, 
and also from Cu,Zn (SS) WT SOD1 (1HL5) by reducing the 
disulphide linkage and removing the metals. The E,E G127X 
mutant was created from Cu,Zn (SS) WT SOD1 by remov- 
ing the metals and the last 20 residues, and then mutating 
the last 6 residues of the remaining sequence (KGGNEE) to 
correspond to the frame-shifted non-native C-terminal peptide 
sequence (GGQRWK) prior to the termination sequence. The 
Cu,Zn G127X mutant was prepared following the same way as 
done for E,E G127X - only the metals were kept intact. The 
full-length C57S/C146S mutants were prepared from Cu,Zn 
(SS) WT SOD1 by mutating both Cys-57 and Cys-146 to Ser- 
ine (Cu,Zn C57S/C146S 1-153) and simultaneously removing 
the metals (E,E C57S/C146S 1-153). The truncated versions 
of these variants were made by following the same technique 
as described for full-length proteins - only residues 128-153 
were deleted for the truncated ones. E,E (SH) SOD1 variants 
1-140 and 1-110 were created from the full length E,E (SH) 
WT SOD1 by deleting the last 13 and 43 residues respectively. 
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As described in the Methods subsection on steered MD sim- 
ulations, all modified structures were energy minimized and 
equilibrated for 20 ns by running an equilibrium simulation, 
before performing any pulling assays. 

Go model energetics. The SMOG Go model recipe used 
here [9] takes heavy atoms within 2.5 A, and applies native 
contacts to them with a Lennard- Jones 6-12 potential. The 
SMOG Go model recipe also attributes energies to dihedral 
terms in the BB and SCs. The energy scale determining well- 
depths for these interactions is given through the relation: 
N c e c + Nbb^bb + Nscesc = N A • (lkJ/mol), where N c is the 
number of non-local contacts and e c is their corresponding 
well-depth, and the other numbers correspond to BB and SC 
dihedral numbers and energy scales respectively. Na on the 
right hand side is the number of atoms. The overall energy 
scale of all interactions is thus given by lkJ/mol times the 
number of atoms in the system. 

Umbrella Sampling and Weighted Histogram Analysis 
Method (WHAM). Initial configurations were obtained from 
pulling simulations as described in the Methods subsection 
on steered MD, to obtain 25 initial conditions between and 
5A. These initial configurations are then simulated for 10ns 
each, in an umbrella potential with stiffness 500kJ/mol/nm 2 
to constrain the simulations near their corresponding separa- 
tion distances. The 25 simulations are then used to recon- 
struct the free energy profile along the distance coordinate 
using the weighted histogram analysis method (WHAM) [36] , 
and the free energy difference between and 5A is then ob- 
tained. 

Metal insertion or extraction in calculating interaction ener- 
gies. To obtain free energies of metal binding, metals were in- 
serted into or extracted from the putative binding locations for 
WT SOD1, and all glycine mutants of SOD1. Metals were al- 
ways found to be at least metastable in their putative binding 
positions. To obtain the metal binding free energy, the most 
direct straight-line path was first determined for the pulling 
direction, by finding the approximate direction of highest sol- 
vent exposure of the metal. Tethers were placed on the metal, 
and on the closest C« atom opposite to the direction of high- 
est solvent exposure. For Cu extraction this corresponded to 
tethering the C a atom of residue Phe45, and for Zn removal 
the corresponding C a atom tether was in Asp83. 

The metal was pulled away from the protein using a spring 
constant of 500 kJ/mol/nm 2 and a pull rate of 0.01 nm/ps. 
For purposes of the free energy calculation, the final extension 
from the equilibrium distance between metal and tethering 
residue was taken to be approximately 30 A. From these tra- 
jectories, snapshots were taken to generate the starting con- 
figurations for the umbrella sampling windows [37,38,39]. An 
asymmetric distribution of sampling windows was used, such 
that the window spacing was lA between and 20 A separa- 
tion, and 2 A beyond 20 A of separation. Such spacing allowed 
for increasing detail at smaller separation distances, and re- 
sulted in a total of 25 windows. In each window, 10 ns of MD 
was performed for a total simulation time of 250 ns utilized 
for umbrella sampling. Analysis of the results was performed 
using the weighted histogram analysis method (WHAM) [36] . 

To remove conformational distortion effects on the free 
energy, position restraints were applied to the protein in the 
following way. For a given C a atom in the protein, all other 
Col atoms within 5A were constrained to have a roughly con- 
stant Col-Col distance, the same as in an equilibrated struc- 
ture, using spring constants of 392 x 10 3 kJ/mol/nm 2 . Using 
Col constraints allows the protein to retain its structure un- 



der force, while still allowing the side chains and partially the 
backbone to fluctuate in response to the external perturba- 
tion. After metal extraction, Col constraints were relaxed and 
the relaxation free energy was calculated as described below. 

The free energy for metal extraction was corrected to ac- 
count for protein relaxation in the final metal depleted state: 
AF tot = AFtfZtraZed + &F relax . That is, after metal ex- 
traction, Col constraints are gradually reduced from 392 x 10 3 
kJ/mol/nm 2 to zero using 30 windows and 10ns of relaxation 
time in each window, and the free energy change for this pro- 
cess is again obtained using WHAM. Convergence of the free 
energy values was tested by both varying the number of win- 
dows (to 30, 35 and 45) and varying the length of the equilib- 
rium simulation in each window (to 15, 20, 25 and 30 ns). In 
all cases the free energies were seen to have converged using 
the original protocol. 

To validate that equilibrium free energies have been ob- 
tained by this procedure, thermodynamic cycles were con- 
structed by re-inserting the metal and equilibrating, for WT 
SOD1, and ALS-associated mutants A4V and G93A, as well 
as PTM variants Cu,E(SS) and E,Zn(SS) [11]. The metal 
binding free energies subject to Col constraints were calcu- 
lated, as well as the subsequent relaxation free energies. The 
total free energy change over such cycles was typically less 
than 0.04 kJ/mol, confirming that the cycles are thermody- 
namic to within the error of the simulation. A more detailed 
description of the procedure is given in reference [11]. 

Potential energy calculations. Cu,Zn(SS) and E,E(SS) WT 
SOD1 protein structures were simulated in a cubic box with 
periodic boundary conditions and size such that all protein 
atoms were initially at least 20 A from any cubic face. An 
all- atom representation of the protein was used, with OPLS- 
AA/L parameters [17,18], and a generalized Born surface area 
(GBSA) implicit solvent model [20] was used, with dielectric 
constants of the protein and solvent taken to be 4 and 80 
respectively. The Onufriev-Bashford-Case (OBC) algorithm 
was used to calculate Born radii [19]. The phenomenological 
surface tension coefficient used in calculating solvation free 
energies was 0.005 kcal/mol/(A 2 ) [20]. The LINCS algorithm 
was applied to constrain all bond lengths that contained a hy- 
drogen atom [21]. Prior to simulation, energy was minimized 
using a steepest descent algorithm to remove any potential 
steric clashes. Simulations were carried out with an integra- 
tion time-step of 2 fs and coordinates were saved after every 
100 ps. The long-distance cut-off used for non-bonded inter- 
actions was 14 A for both Electrostatic and van der Waals 
(VDW) interactions. 

For both the Cu,Zn(SS) and E,E(SS) WT SOD1 variants 
of SOD1 protein, a simulation was run for 20 ns to equili- 
brate the protein. Once equilibrium is reached, the potential 
energy was recorded and averaged over 50 snapshots, where 
each snapshot is taken every 0.5ns. The final equilibrated 
conformation was then mutated in PyMol to either one of 
the ALS-associated mutants described in the main text or 
alanine mutants described above. The mutated structures 
were then again equilibrated until convergence of the poten- 
tial energy (typically w 35ns). Then 50 snapshots were taken, 
once every 0.5ns, and these snapshots were used to obtain the 
ensemble- averaged potential energy of the mutant. In this 
way, A[/ wt ^mut is obtained for each mutant. A histogram 
of this quantity is plotted for the 30 non-ALS alanine mutants 
described in the above text, in Figure S18B. 

Frustratometer analysis.Highly frustrated contacts were 
found using the "frustratometer" method of Ferreiro et. al., 
given in reference [12]. This method finds the particular 
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contacts that are highly frustrated in a one protein confor- 
mation. Metal-protein interactions are not included in the 
frustratometer analysis; only protein-protein interactions are 
considered. We have used equilibrium protein conformations 
of the Cu,Zn(SS) and E,E(SS) states of SOD1 to distinguish 
the effect of metals on SOD1 protein structure. An equi- 
librium ensemble of 50 conformations was obtained for both 
WT SOD1 and mutants of SOD1, as described in the previ- 
ous section above. The number of highly frustrated contacts 
within a sphere of 5A around each C a atom was found for ev- 
ery snapshot, and averaged over snapshots. This number was 
again averaged over mutants to obtain the mutant and equilib- 
rium ensemble- averaged number of highly frustrated contacts 
around a given C Q atom. 




Number of Points with AW(>0) Number of Randomly Generated Trajectoriesx 10 6 



Fig. S21. A Statistical significance test between E,E (SH) WT SOD1 and E,E 
G127X cumulative distributions. The black curve denotes the E,E (SH) WT profile, 
and the red curve denotes the E,E G127X profile. The collection of green lines cor- 
responds to 100 randomly-generated cumulative distributions in the following way: 
starting from the E,E (SH) WT work values, gaussian noise with zero mean and 
standard deviation a = 2.7 kJ/mol is added to each work value. The resulting 
collection of work values are then rank-ordered to generate a cumulative distribution, 
and this process is then repeated N times. The collection of blue lines corresponds to 
N = 1000. For large enough N, one will begin to find outlying cumulative distribu- 
tions which deviate as much or more than a given trial distribution, here E,E G127X. 
An example of such an outlier is shown in panel B. In this case, the number of positive 
deviations (differences in work value) between the red and black curves in panel A are 
recorded (27), along with their work values, which are themselves rank ordered largest 
to smallest and plotted in panel C. An outlier at least as extreme as E,E G127X must 
have a set of positive work deviations such that the corresponding curve lies above 
the red curve in panel C. In this case the randomly generated outlier has 32 positive 
deviations in work, 27 of which are larger than those between E,E G127X and E,E 
(SH) WT variants. The number of randomly generated trajectories is increased, until 
the mean number of successful trajectories, corresponding to the criterion in panel C, 
exceeds unity. A plot of the mean number of successful trajectories as a function of 
N is given in panel D. The statistical significance p of a given trajectory is then given 
as p = 1/Ni, where Ni is the number of randomly generated trajectories that 
give an expectation value of unity for the number of successfully-generated outliers. 
In this case, the statistical significance of E,E G127X is about 1/(1.1 X 10 6 ), or 
about 9 X 10 -7 . 



Statistical Analysis. The mechanical profile is a collection of 
48 work values for a given SOD1 variant. Each work value 
measured from our simulations is accurate to some a kJ/mol 
which we determine as follows. The difference in the mechan- 
ical scans of two initial conditions, for example as derived 



from different crystal structures, gives a measure of the error 
in the mechanical scan. For E,E (SH) WT SOD1 shown in the 
main panel of Figure S4, the work difference AW has mean 
0.03k J/mol and standard deviation a = 2.1kJ/mol. A z-test 
indicates that with 93% certainty the values of AW arise from 
a gaussian distribution of mean zero and variance <j - On the 
other hand, the values of AW in the inset A of Figure S4 
for Cu,Zn (SS) WT SOD1 have mean 0.75kJ/mol and stan- 
dard deviation 1.5k J/mol. This distribution fails a z-test for a 
gaussian with mean zero and a Q = 1.5k J/mol, indicating that 
the error in measurement is larger than a Q . We increase the 
standard deviation until the z-test indicates the data arises 
from a gaussian of mean zero and standard deviation a. This 
gives a = 2.7kJ/mol, which we use as a measure of the error 
bars in the work values of the mechanical profile. 

In comparing two mechanical profiles residue by residue, 
the probability of finding the second profile by chance, given 
the measurements W% of the first profile, and the error bars 
a, is: 

P({A W }) = if[«rfc(^gi). [1] 

where |AWi| is the modulus of the difference between the two 
work values for residue z, and N = 48 here. 

Comparing two cumulative distributions is more forgiv- 
ing, in that there is a greater likelihood two different mechan- 
ical profiles may still give the same overall work distribution. 
These two protein variants might then be said to have the 
same overall mechanical stability, although what parts of the 
structure were the most or least stable would differ for each 
protein. 

Because the overall variance of a given work profile is so 
large however (typically « 15 k J/mol), it is very likely to ob- 
tain the cumulative distribution of any one variant from any 
other variant by a direct Kolmogorov-Smirnov test. We are 
interested in a different question however: given one work pro- 
file, its cumulative distribution, and the error bars associated 
with the work values, what is the probability of obtaining a 
cumulative distributions at least as extreme as another given 
one by chance, i.e. as arising from the original work profile? 
This is a complicated question because it is non-trivial to ac- 
count for the combinatorial number of ways of obtaining a 
given cumulative distribution. 

To find the above probability, and thus determine the sta- 
tistical significance of a given cumulative distribution, we em- 
ploy a "Monte Carlo" procedure of generating cumulative dis- 
tributions from a given "baseline" work profile. For example, 
to test the effect of truncation at the C-terminus which me- 
chanically stabilizes the protein, we wish to find the prob- 
ability of obtaining the E,E G127X cumulative distribution 
(or a distribution even more stabilized) from the E,E (SH) 
WT SOD1 cumulative distribution. The E,E G127X cumu- 
lative distribution has 30 work values that are more stable 
than those in the E,E (SH) WT SOD1 cumulative distribu- 
tion; the difference in work values are plotted in Figure S21 
panel C, largest to smallest. We seek cumulative distributions 
that have rank ordered deviations at least as large as those of 
the E,E G127X cumulative distribution. 

Work profiles are constructed by adding gaussian noise 
to a the baseline work profile (here that of E,E (SH) WT 
SOD1), where the gaussian noise has mean zero and standard 
deviation a = 2.7 k J/mol. A cumulative distribution is then 
constructed for each generated profile by sorting the values 
lowest to highest. After constructing N of such cumulative 
distributions, we ask whether one has found a cumulative dis- 
tribution with, e.g. rank-ordered positive deviations at least 
as large as those in the E,E G127X cumulative distribution. 
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The value of N where the expected number of trajectories is 
unity determines the statistical significance: p = 1/N. Plots 
of the corresponding distributions for E,E (SH) WT S0D1 
and E,E G127X S0D1 are given in Figure S21. 

If 2 cumulative distributions are different, their mechani- 
cal fingerprints are necessarily different. If they are the same, 



their fingerprints can always be compared to discern the ef- 
fects of mutation. In practice however we never encountered 
two different fingerprints that gave rise to indistinguishable 
cumulative distributions. 
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Supporting Information Tables 



Table SI. Correlation table of Work(upper diagonal) and RMSF(lower diagonal) values 

for different variants of SOD1 



RMSF/Work 


Cu.Zn (SS) WT 


E,E (SS) WT 


E,E (SH) WT 


E,E G127X 


E,E 1-140 


E,E 1-110 


Cu.Zn (SS) WT 


1/1 


-0.37 


-0.16 


0.20 


0.06 


0.14 


E,E (SS) WT 


0.16 


1/1 


0.51 


-0.17 


-0.03 


0.22 


E,E (SH) WT 


0.10 


0.51 


1/1 


-0.09 


-0.13 


0.38 


E,E G127X 


0.06 


-0.14 


0.13 


1/1 


0.32 


0.62 


E,E 1-140 


0.02 


0.09 


0.11 


0.21 


1/1 


0.56 


E,E 1-110 


0.03 


0.13 


0.11 


0.21 


0.22 


1/1 



Correlations are generally insignificant. The two significant correlations are between E,E G127X and E,E 1-110 (p=1.0e-4), and E,E 
1-140 and E,E 1-110 (p=6.2e-4). 



Table S2. Central tethering residues for mechanical 
profiles 



SOD1 variant 


Tethering residue 


PDB ID used for 






structure generation 


Cu.Zn (SS) WT 


46 (76*) 


1HL5, 2C9V 


E,E (SS) WT 


46 (76*) 


1RK7* 


E,E (SH) WT 


46 (76*) 


1HL5, 1RK7 + 


Cu.Zn G127X 


45 (67*) 


1HL5* 


E,E G127X 


45 (67*) 


1HL5* 


E,E 1-140 


46 (71*) 


1HL5 


E,E 1-110 


46 (86*) 


1HL5 


Cu.Zn C57S/C146S 1-153 


46 (76*) 


1HL5® 


E,E C57S/C146S 1-153 


46 (76*) 


1HL5 


Cu.Zn C57S 1-127 


45 (64*) 


1HL5 


E ; E C57S 1-127 


45 (64*) 


1HL5 9 



*for residue 40, 45, 46, 50, 54 and 55. 

* Structures reported have mutations from the WT sequence; these were "back- mutated" to construct structures for the WT 

sequence, as described in the Methods subsection on modeling proteins with no PDB structure. 

*residues 134-153 deleted and 128-133 mutated. 

e Cys-57 and Cys-146 mutated to Serine. 

e Cys-57 mutated to Serine and residues 128-153 deleted. 



Table S3. Proteins considered for Frustratometer results and Interaction Energy calculations 



Frustratometer 


Interaction Energy 


A4V,G37R,L38V,G41D,G41S,H43R, 


A4G,L8G,I17G,L38G,H43G,H46G, 


H46R,H46R/H48QT54R,D76Y,H80R, 


H46G/H48G,T54G,D76G,H80G,D83G,L84G, 


L84F,G85R,D90A,G93A,G93C,E100G, 


D90G,E100G,I113G,R115G,D124G,D125G, 


I113T,D124V,D125H,S134N,L144F* 


S134G,A140G,R143G,L144G,V148G,ll49G t 



V5A,G10A,Q15A,F20A,S25A, A1G,P13G,E21G,Q22G,K23G,E24G,S25G,N26G,P28G,V31G,S34G, 

K30A,I35A,E40A,F45A,F50A, E40G,E49G,F50G,D52G,N53G,A55G,C57G,T58G,S59G,A60G,P62G, 

G56A,S59A,N65A,K70A,K75A, H63G,F64G,N65G,P66G,L67G,S68G,R69G,K70G,H71G,P74G,K75G, 

H80A,G85A,K91A,D96A,E100A, E77G,E78G,R79G,V81G,N86G,A89G,V94G,A95G,D101G,S102G,V103G, 



A105A,H110A,R115A,H120A,D125A, I104G,S105G,L106G,S107G,D109G,H110G,C111G,I112G,H120G,E121G,K122G, 
G130A,T135A,N139A,L144A,G150A* A123G,L126G,K128G,N131G,E132G,E133G,T135G,K136G,T137G,N139G,S142G @ 

* Proteins used for Figure 3. 

*Proteins used for Figure 5, Figure S18 and Figure S19. 

* Proteins used for Figure S18. 

Proteins used for Figure S18 and Figure S19. 
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Table S4. Mechanical work values of WT SOD1 variants - the values reported in the table depict the work in kJ/mol needed 

to pull a particular residue up to 5 A. t 



Residue 
Index 


Cu.Zn 
(SS) WT 


E,E 
(SS) WT 


E,E 
(SH) WT 


Cu.Zn 
G127X 


E,E 
G127X 


Cu.Zn 1-153 
C57S/C146S 


E,E 1-153 
C57S/C146S 


Cu.Zn 1-127 
C57S 


E,E 1-127 
C57S 


E,E 
1-140 


E,E 
1-110 


1 


66.32 


63.90 


60.32 


62.47 


60.37 


67.12 


61.25 


64.62 


66.37 


61.09 


54.08 


5 


68.85 


74.92 


88.95 


71.83 


75.80 


76.02 


81.27 


70.66 


72.22 


66.40 


68.90 


10 


66.18 


94.68 


81.29 


82.38 


86.41 


92.94 


72.22 


75.83 


61.27 


70.19 


72.00 


11 


108.41 


62.53 


59.51 


68.39 


65.69 


70.20 


66.80 


64.41 


80.27 


64.09 


58.71 


15 


89.97 


71.56 


68.81 


99.48 


104.00 


77.44 


91.12 


75.83 


69.37 


88.09 


76.67 


17 


91.53 


61.60 


45.15 


99.48 


95.69 


73.91 


66.87 


72.83 


72.27 


94.09 


67.84 


20 


78.24 


67.77 


60.67 


67.49 


64.74 


87.41 


82.12 


76.87 


90.29 


92.40 


65.87 


24 


58.82 


54.46 


56.77 


57.99 


55.69 


67.01 


72.22 


69.32 


81.29 


64.09 


52.77 


25 


62.31 


126.04 


87.35 


70.39 


72.65 


73.22 


78.18 


65.93 


77.22 


74.69 


69.98 


30 


42.90 


124.00 


113.40 


51.00 


54.39 


76.48 


59.16 


84.37 


68.22 


73.50 


72.93 


31 


66.38 


76.52 


55.92 


81.20 


85.69 


67.20 


71.29 


65.02 


69.22 


74.09 


65.59 


35 


77.33 


104.09 


84.16 


79.39 


76.61 


66.55 


69.27 


76.93 


72.12 


62.59 


67.33 


38 


67.11 


63.59 


64.74 


79.43 


75.69 


76.01 


80.18 


70.77 


78.20 


54.09 


57.77 


40 


94.41 


98.00 


91.57 


75.48 


73.67 


68.33 


90.19 


87.83 


71.22 


59.20 


64.95 


45 


83.49 


64.31 


68.18 


67.88 


65.04 


76.04 


81.20 


64.83 


61.22 


76.90 


61.17 


46 


63.32 


56.23 


66.83 


79.97 


75.69 


66.12 


71.27 


74.44 


60.27 


84.09 


67.54 


50 


57.08 


112.86 


78.24 


71.98 


73.14 


73.51 


69.37 


70.73 


59.38 


67.59 


63.63 


54 


85.03 


51.12 


71.63 


59.99 


55.69 


60.20 


61.11 


60.84 


67.37 


64.09 


59.27 


55 


88.66 


78.95 


72.65 


94.99 


95.16 


69.05 


58.27 


59.86 


78.22 


80.09 


76.12 


60 


79.50 


79.77 


58.41 


65.00 


69.98 


75.93 


66.27 


65.37 


81.29 


91.19 


66.11 


65 


92.99 


76.29 


83.28 


88.99 


87.99 


86.94 


71.27 


66.83 


61.29 


66.19 


72.21 


70 


80.56 


64.42 


95.86 


68.34 


69.06 


66.07 


58.33 


77.79 


58.48 


94.30 


73.95 


73 


125.06 


43.42 


60.82 


107.09 


105.69 


70.20 


71.22 


94.93 


69.32 


104.09 


78.50 


75 


72.82 


94.37 


71.42 


59.49 


55.12 


78.13 


65.37 


75.94 


66.33 


69.90 


58.25 
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59 40 
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91 


98 44 


46 27 


61 36 


58 40 


55 69 


74 91 


90 44 


76 04 


82 22 


54 09 


53 91 


95 


61.74 


90.15 


82.99 


97.88 


96.10 


84.16 


62.22 


85.33 


90.19 


101.20 


81.48 


100 


97.82 


81.67 


87.44 


57.39 


54.90 


73.11 


71.27 


70.93 


81.22 


78.19 


71.92 


101 


65.32 


75.58 


40.28 


82.39 


85.69 


74.20 


66.66 


75.93 


78.79 


84.09 


66.99 


105 


73.50 


71.67 


63.50 


65.30 


62.21 


80.68 


77.72 


82.02 


71.22 


71.90 


61.38 


108 


69.29 


54.35 


40.29 


77.00 


75.69 


66.12 


81.18 


70.04 


80.12 


64.09 


56.23 


110 


66.24 


96.08 


86.64 


82.47 


79.07 


73.55 


60.38 


65.88 


71.29 


96.50 


55.02 


115 


81.13 


57.64 


72.76 


101.09 


106.83 


76.08 


71.29 


83.90 


66.65 


61.59 




120 


81.67 


58.15 


101.18 


87.30 


85.65 


81.57 


59.33 


66.02 


61.20 


81.19 




121 


113.67 


59.55 


66.77 


78.02 


75.69 


70.44 


56.48 


85.00 


59.43 


84.09 




125 


64.12 


83.58 


82.85 


76.00 


77.76 


70.37 


61.22 


65.32 


66.22 


71.30 




1 0~7 

127 
















o4.0z 


~7n cc\ 

70.69 




* 


130 


87.92 


68.13 


103.14 


71.32 


72.34 


86.75 


71.27 


* 


* 


55.59 


* 


133 








68.03 


66.35 






* 


* 




* 


135 


79.41 


116.09 


85.13 


* 


* 


96.52 


80.69 


* 


* 


62.29 


* 


136 


62.52 


46.75 


70.28 


* 


* 


67.12 


71.22 


* 


* 


74.09 


* 


140 


99.75 


77.40 


92.17 


* 


* 


74.94 


66.66 


* 


* 


57.70 


* 


141 


147.35 


58.33 


56.82 


* 


* 


64.76 


71.22 


* 


* 


* 


* 


145 


72.82 


62.23 


76.21 


* 


* 


66.20 


60.16 


* 


* 


* 


* 


146 


80.53 


45.31 


61.20 


* 


* 


67.12 


61.29 


* 


* 


* 


* 


150 


68.12 


74.84 


80.22 


* 


* 


74.76 


66.44 


* 


* 


* 


* 


151 


76.67 


61.72 


66.17 


* 


* 


73.83 


61.27 


* 


* 


* 


* 


153 


69.05 


69.50 


60.34 


* 


* 


66.12 


60.33 


* 


* 


* 


* 



t "*" indicates residues that are not present in the given variant, while "— " indicates work values that were not measured for a given 
variant. 
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